ddf_utils.model package

Submodules

ddf_utils.model.ddf module

The DDF model

class ddf_utils.model.ddf.Concept(id: str, concept_type: str, props: dict = NOTHING)

Bases: object

to_dict()
class ddf_utils.model.ddf.DDF(concepts: Dict[str, ddf_utils.model.ddf.Concept] = NOTHING, entities: Dict[str, ddf_utils.model.ddf.EntityDomain] = NOTHING, datapoints: Dict[str, Dict[str, ddf_utils.model.ddf.DataPoint]] = NOTHING, synonyms: Dict[str, ddf_utils.model.ddf.Synonym] = NOTHING, props: dict = NOTHING)

Bases: object

get_datapoints(i, by=None)
get_entities(domain, eset=None)
get_synonyms(concept)

Get the synonym dictionary for a concept. Returns None if the concept has no synonyms.

indicators(by=None)
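The DDF object above is typically obtained by loading a DDFcsv datapackage (see ddf_utils.model.package below) and then queried through these methods. A minimal sketch, assuming ddf_utils is installed; the dataset path and the "geo"/"country" domain and set names are hypothetical:

```python
def summarize_ddf(path):
    """Load a DDFcsv dataset at `path` and print a short summary.

    Assumes ddf_utils is installed and `path` points to a DDFcsv
    dataset directory (hypothetical here).
    """
    from ddf_utils.model.package import DDFcsv

    ddf = DDFcsv.from_path(path).load_ddf()
    print(sorted(ddf.concepts))                    # all concept ids
    print(ddf.indicators())                        # indicator concepts
    # entities of the hypothetical 'geo' domain, 'country' set
    print(ddf.get_entities('geo', eset='country'))
    # synonym dict for 'geo', or None when none are defined
    print(ddf.get_synonyms('geo'))
```

The function defers the import so the sketch can sit alongside code that does not have ddf_utils installed.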
class ddf_utils.model.ddf.DaskDataPoint(id: str, dimensions: Tuple[str], path: Union[List[str], str], concept_types: dict, read_csv_options: dict = NOTHING, store='dask')

Bases: ddf_utils.model.ddf.DataPoint

Load datapoints with dask.

data
class ddf_utils.model.ddf.DataPoint(id: str, dimensions: Tuple[str], store: str)

Bases: abc.ABC

A DataPoint object stores a set of datapoints which have the same dimensions and belong to a single indicator.

data
class ddf_utils.model.ddf.Entity(id: str, domain: str, sets: List[str], props: dict = NOTHING)

Bases: object

to_dict(pkey=None)

Create a dictionary containing the name, domain, is--headers, and properties, so it can easily be plugged into pandas.DataFrame.from_records().
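A sketch of the record shape that to_dict() plausibly produces (the column names, entity ids, and props here are illustrative, not taken from a real dataset): the entity id under its domain column, boolean is--<set> membership headers, then the properties. Rows of this shape feed directly into pandas.DataFrame.from_records():

```python
# Illustrative Entity.to_dict()-style records (keys are hypothetical).
records = [
    {"country": "swe", "is--country": "TRUE", "name": "Sweden"},
    {"country": "usa", "is--country": "TRUE", "name": "United States"},
]

# With pandas installed, these become a DataFrame in one call:
# import pandas as pd
# df = pd.DataFrame.from_records(records)
```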

class ddf_utils.model.ddf.EntityDomain(id: str, entities: List[ddf_utils.model.ddf.Entity] = NOTHING, props: dict = NOTHING)

Bases: object

add_entity(ent: ddf_utils.model.ddf.Entity)
entity_ids
entity_sets
classmethod from_entity_list(domain_id, entities, allow_duplicated=True, **kwargs)
get_entity_set(s)
has_entity(sid)
to_dict(eset=None)
class ddf_utils.model.ddf.PandasDataPoint(id: str, dimensions: Tuple[str], path: str, dtypes: dict, store='pandas')

Bases: ddf_utils.model.ddf.DataPoint

Load datapoints with pandas.

data
class ddf_utils.model.ddf.Synonym(concept_id: str, synonyms: Dict[str, str])

Bases: object

to_dataframe()
to_dict()

ddf_utils.model.package module

datapackage model

class ddf_utils.model.package.DDFSchema(primaryKey: List[str], value: str, resources: List[str])

Bases: object

classmethod from_dict(d: dict)
class ddf_utils.model.package.DDFcsv(base_path: str, resources: List[ddf_utils.model.package.Resource], props: dict = NOTHING, ddfSchema: Dict[str, List[ddf_utils.model.package.DDFSchema]] = NOTHING)

Bases: ddf_utils.model.package.DataPackage

DDFCSV datapackage.

static entity_domain_to_categorical(domain: ddf_utils.model.ddf.EntityDomain)
static entity_set_to_categorical(domain: ddf_utils.model.ddf.EntityDomain, s: str)
classmethod from_dict(d_: dict, base_path='./')
generate_ddf_schema(progress_bar=False)

Generate the DDF schema from all resources.

Parameters: progress_bar (bool) – whether a progress bar should be shown while generating the ddfSchema.
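The generated ddfSchema maps a collection name ("concepts", "entities", "datapoints", "synonyms") to a list of records matching the DDFSchema signature above (primaryKey / value / resources). A sketch of one such record; the concept ids and resource name are illustrative:

```python
# Hypothetical ddfSchema record for a datapoints resource.
schema_record = {
    "primaryKey": ["geo", "time"],
    "value": "population",
    "resources": ["ddf--datapoints--population--by--geo--time"],
}

# With ddf_utils installed, this dict round-trips through the class:
# from ddf_utils.model.package import DDFSchema
# s = DDFSchema.from_dict(schema_record)
```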
get_ddf_schema(update=False)
load_ddf() → DDF

to_dict()

Dump the datapackage to a dict (e.g. for serializing to datapackage.json on disk).

class ddf_utils.model.package.DataPackage(base_path: str, resources: List[ddf_utils.model.package.Resource], props: dict = NOTHING)

Bases: object

classmethod from_dict(d_: dict, base_path='./')
classmethod from_json(json_path)
classmethod from_path(path)
to_dict()

Dump the datapackage to a dict (e.g. for serializing to datapackage.json on disk).
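A minimal datapackage.json-style dict of the shape that from_dict plausibly accepts; the resource and schema layout follows the Resource and TableSchema signatures below, and the dataset/resource names are illustrative:

```python
# Hypothetical minimal datapackage dict (names are illustrative).
d = {
    "name": "my-dataset",
    "resources": [
        {
            "name": "ddf--concepts",
            "path": "ddf--concepts.csv",
            "schema": {
                "fields": [{"name": "concept"}, {"name": "concept_type"}],
                "primaryKey": "concept",
            },
        }
    ],
}

# With ddf_utils installed:
# from ddf_utils.model.package import DataPackage
# dp = DataPackage.from_dict(d, base_path="./")
```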

class ddf_utils.model.package.Resource(name: str, path: str, schema: ddf_utils.model.package.TableSchema)

Bases: object

classmethod from_dict(d: dict)
to_dict()
class ddf_utils.model.package.TableSchema(fields: List[dict], primaryKey: Union[List[str], str])

Bases: object

Table Schema Object Class

common_fields
field_names
classmethod from_dict(d: dict)

ddf_utils.model.repo module

model for dataset repositories.

class ddf_utils.model.repo.Repo(uri, base_path=None)

Bases: object

local_path
name
show_versions()
to_datapackage(ref=None)

Turn repo@ref into a DataPackage.
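A sketch of the Repo workflow, assuming ddf_utils is installed; the URI passed in is hypothetical and may be a git URL or a local path (is_url below distinguishes the two):

```python
def clone_and_load(uri, ref=None):
    """Open a dataset repository and return it as a DataPackage.

    Assumes ddf_utils is installed; `uri` is a hypothetical
    repository URL or local path.
    """
    from ddf_utils.model.repo import Repo

    repo = Repo(uri)
    print(repo.name, repo.local_path)   # where the repo lives locally
    repo.show_versions()                # list the available refs
    return repo.to_datapackage(ref=ref)
```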

ddf_utils.model.repo.is_url(r)

ddf_utils.model.utils module

ddf_utils.model.utils.absolute_path(path: str) → str
ddf_utils.model.utils.sort_json(dp)

Sort a JSON object; dp is the datapackage.json content (as a dict).