data

lovelyrita.data.get_column_names(path, valid_column_names=['badge_number', 'city', 'fine_amount', 'latitude', 'longitude', 'state', 'street', 'street_name', 'street_number', 'street_suffix', 'ticket_issue_date', 'ticket_issue_time', 'ticket_number', 'violation_desc_long', 'violation_external_code', 'voided'])

Return the column names that are both present in the dataset and listed in valid_column_names.
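
The intersection described above can be illustrated with a minimal sketch that uses only the standard library. The helper name `column_intersection`, the shortened list of valid names, and the sample file are assumptions made for illustration; this is not the library's implementation, and the real function may return a different type or ordering.

```python
import csv
import os
import tempfile

# Shortened, illustrative stand-in for the default valid_column_names list.
VALID_COLUMN_NAMES = ["city", "fine_amount", "latitude", "longitude", "ticket_number"]


def column_intersection(path, valid_column_names=VALID_COLUMN_NAMES):
    """Hypothetical helper: header columns of a CSV file that are also valid names."""
    with open(path, newline="") as f:
        header = next(csv.reader(f))  # first row holds the column names
    valid = set(valid_column_names)
    # Keep the order in which the columns appear in the file.
    return [name for name in header if name in valid]


# Write a tiny sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("ticket_number,city,officer_notes,fine_amount\n")
    tmp.write("1001,Oakland,n/a,58\n")
    sample_path = tmp.name

columns = column_intersection(sample_path)
print(columns)  # officer_notes is dropped because it is not a valid name

os.remove(sample_path)
```

A list like this could plausibly be passed on as the usecols argument of read_data, although the source does not state that this is the intended use.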

lovelyrita.data.get_sample_value(series)

Return a sample value from a series

Parameters:

series : pandas.Series

Returns:

A sample value from the series, or None if all values in the series are null
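
A minimal sketch of the documented behaviour, assuming that "a sample value" can be taken as the first non-null entry. The helper name `sample_value` is hypothetical, and the library itself may select the value differently (for example, at random).

```python
import pandas as pd


def sample_value(series):
    """Hypothetical helper: first non-null value of a series, or None if all are null."""
    non_null = series.dropna()
    if non_null.empty:
        return None
    return non_null.iloc[0]


cities = pd.Series([None, "Oakland", "Berkeley"])
all_null = pd.Series([None, None], dtype=object)

print(sample_value(cities))
print(sample_value(all_null))
```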

lovelyrita.data.read_data(paths, usecols=None, delimiter=', ', clean=False)

Load data from a list of file paths.

Parameters:

paths : list

A list of file paths to the data to be loaded

dtype : dict

A dict containing key (column name) and value (data type)

delimiter : str

Returns:

A DataFrame containing the loaded data
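
Loading several files into one DataFrame can be sketched with plain pandas as follows. The helper name `read_paths`, the sample files, and the use of `pandas.read_csv` with `pandas.concat` are assumptions for illustration; the source does not say how the library combines the files or what the clean flag does, so that flag is left out.

```python
import os
import tempfile

import pandas as pd


def read_paths(paths, usecols=None, delimiter=","):
    """Hypothetical helper: read each delimited file and stack the rows."""
    frames = [pd.read_csv(p, usecols=usecols, delimiter=delimiter) for p in paths]
    return pd.concat(frames, ignore_index=True)


with tempfile.TemporaryDirectory() as tmpdir:
    first = os.path.join(tmpdir, "january.csv")
    second = os.path.join(tmpdir, "february.csv")
    with open(first, "w") as f:
        f.write("ticket_number,city,fine_amount\n1001,Oakland,58\n1002,Oakland,83\n")
    with open(second, "w") as f:
        f.write("ticket_number,city,fine_amount\n2001,Berkeley,45\n")

    # Only the requested columns are loaded from each file.
    combined = read_paths([first, second], usecols=["ticket_number", "city"])

print(combined)
```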

lovelyrita.data.summarize(dataframe)

Generate a summary of the data in a dataframe.

Parameters:

dataframe : pandas.DataFrame

Returns:

A DataFrame containing the data type, number of unique values, a sample value, and number and percent of null values
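
A summary of this kind can be approximated with plain pandas. The sketch below is an assumption about the shape of the result, not the library's code: the helper name `summarize_sketch` and the output column names (`dtype`, `n_unique`, `sample_value`, `n_null`, `pct_null`) are hypothetical.

```python
import pandas as pd


def first_non_null(series):
    """Return the first non-null value of a series, or None if there is none."""
    non_null = series.dropna()
    return None if non_null.empty else non_null.iloc[0]


def summarize_sketch(dataframe):
    """Hypothetical helper: one summary row per column of the input frame."""
    return pd.DataFrame({
        "dtype": dataframe.dtypes,
        "n_unique": dataframe.nunique(),  # null values are not counted
        "sample_value": pd.Series(
            {name: first_non_null(dataframe[name]) for name in dataframe.columns}
        ),
        "n_null": dataframe.isna().sum(),
        "pct_null": dataframe.isna().mean() * 100,
    })


tickets = pd.DataFrame({
    "city": ["Oakland", None, "Oakland", "Berkeley"],
    "fine_amount": [58.0, 58.0, None, None],
})
summary = summarize_sketch(tickets)
print(summary)
```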

lovelyrita.data.to_geodataframe(dataframe, copy=False, drop_null_geometry=True, projection='epsg:4326')

Convert a pandas DataFrame to a geopandas GeoDataFrame.

Parameters:

dataframe : pandas.DataFrame

Must contain latitude and longitude fields

copy : bool

drop_null_geometry : bool

projection : str

Returns:

A GeoDataFrame of the given DataFrame
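
One way such a conversion is commonly done with geopandas is sketched below. The helper name `to_geodataframe_sketch` is hypothetical, and the handling of the copy and drop_null_geometry flags is an assumption inferred from the parameter names; the library's own behaviour may differ.

```python
import geopandas as gpd
import pandas as pd


def to_geodataframe_sketch(dataframe, copy=False, drop_null_geometry=True,
                           projection="epsg:4326"):
    """Hypothetical helper: build point geometries from latitude/longitude columns."""
    df = dataframe.copy() if copy else dataframe
    if drop_null_geometry:
        # Assumed meaning: discard rows that cannot be placed on a map.
        df = df.dropna(subset=["latitude", "longitude"])
    # Points take x = longitude, y = latitude.
    geometry = gpd.points_from_xy(df["longitude"], df["latitude"])
    return gpd.GeoDataFrame(df, geometry=geometry, crs=projection)


tickets = pd.DataFrame({
    "ticket_number": [1001, 1002, 1003],
    "latitude": [37.80, None, 37.87],
    "longitude": [-122.27, -122.25, -122.26],
})
gdf = to_geodataframe_sketch(tickets, copy=True)
print(gdf)
```

A GeoDataFrame built this way can be written out with geopandas' own `GeoDataFrame.to_file` method, which is presumably what write_shapefile wraps, though the source does not confirm this.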

lovelyrita.data.write_shapefile(geodataframe, path)

Write a geodataframe to a shapefile.

Parameters:

geodataframe : geopandas.GeoDataFrame

path : str