data
-
lovelyrita.data.get_column_names(path, valid_column_names=['badge_number', 'city', 'fine_amount', 'latitude', 'longitude', 'state', 'street', 'street_name', 'street_number', 'street_suffix', 'ticket_issue_date', 'ticket_issue_time', 'ticket_number', 'violation_desc_long', 'violation_external_code', 'voided'])
Return the intersection of the columns present in the dataset and the valid column names.
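The intersection described above can be illustrated with a minimal sketch. It is not the library's implementation: the helper name `column_intersection`, the shortened list of valid names, and the in-memory CSV are all assumptions made for illustration, and the sketch reads from a file object rather than a path.

```python
import csv
import io

# Hypothetical subset of the valid column names shown in the signature.
VALID_COLUMN_NAMES = ['city', 'fine_amount', 'latitude', 'longitude', 'ticket_number']


def column_intersection(csv_file, valid_column_names):
    """Return the header columns that also appear in valid_column_names.

    Illustrative stand-in only; the order of the file's header is preserved.
    """
    header = next(csv.reader(csv_file))
    valid = set(valid_column_names)
    return [name for name in header if name in valid]


# A made-up dataset with one column ("officer") that is not a valid name.
sample = io.StringIO("ticket_number,officer,city,fine_amount\n1,Smith,Oakland,58\n")
columns = column_intersection(sample, VALID_COLUMN_NAMES)
print(columns)
```

Columns that are not in the valid list are dropped, and valid names missing from the file are simply absent from the result.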
-
lovelyrita.data.get_sample_value(series)
Return a sample value from a series.
Parameters: series : pandas.Series
Returns: A sample value from the series, or None if all values in the series are null
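One way to get the documented behavior is sketched below. The helper name `sample_value` is hypothetical, and picking the first non-null value is an assumption; the library may choose its sample differently.

```python
import pandas as pd


def sample_value(series):
    """Return one non-null value from the series, or None if there is none.

    Illustrative stand-in: takes the first non-null value (an assumption).
    """
    non_null = series.dropna()
    if non_null.empty:
        return None
    return non_null.iloc[0]


fines = pd.Series([None, 58.0, 83.0])              # one null, two values
empty = pd.Series([None, None], dtype="float64")   # all values null

first = sample_value(fines)
nothing = sample_value(empty)
print(first, nothing)
```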
-
lovelyrita.data.read_data(paths, usecols=None, delimiter=', ', clean=False)
Load data from a list of file paths.
Parameters: paths : list
A list of file paths to the data to be loaded
dtype : dict
A dict mapping each column name (key) to its data type (value)
delimiter : str
Returns: A DataFrame containing the loaded data
-
lovelyrita.data.summarize(dataframe)
Generate a summary of the data in a dataframe.
Parameters: dataframe : pandas.DataFrame
Returns: A DataFrame containing the data type, number of unique values, a sample value, and the number and percent of null values
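The kind of summary described above could be assembled roughly as follows. This is a sketch under assumptions, not the library's code: the function name `summarize_sketch`, the one-row-per-column layout, and the output column labels (`dtype`, `n_unique`, `sample_value`, `n_null`, `pct_null`) are all hypothetical.

```python
import pandas as pd


def summarize_sketch(dataframe):
    """Build a per-column summary table (illustrative stand-in only)."""
    rows = []
    for name in dataframe.columns:
        series = dataframe[name]
        non_null = series.dropna()
        n_null = int(series.isnull().sum())
        rows.append({
            "column": name,
            "dtype": str(series.dtype),
            "n_unique": int(series.nunique()),  # nulls are not counted
            "sample_value": non_null.iloc[0] if not non_null.empty else None,
            "n_null": n_null,
            "pct_null": 100.0 * n_null / len(series) if len(series) else 0.0,
        })
    return pd.DataFrame(rows).set_index("column")


# Made-up ticket data with some missing values.
tickets = pd.DataFrame({
    "city": ["Oakland", "Oakland", None, "Berkeley"],
    "fine_amount": [58.0, None, None, 83.0],
})
summary = summarize_sketch(tickets)
print(summary)
```

A table like this makes it easy to spot columns that are mostly null before any further processing.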
-
lovelyrita.data.to_geodataframe(dataframe, copy=False, drop_null_geometry=True, projection='epsg:4326')
Convert a pandas DataFrame to a geopandas GeoDataFrame.
Parameters: dataframe : pandas.DataFrame
Must contain latitude and longitude fields
copy : bool
drop_null_geometry : bool
projection : str
Returns: A GeoDataFrame of the given DataFrame