ddf_utils.chef package

Submodules

ddf_utils.chef.api module

APIs for chef

ddf_utils.chef.api.run_recipe(fn, ddf_dir, out_dir)

run the recipe file and serve result

ddf_utils.chef.exceptions module

exceptions for chef

exception ddf_utils.chef.exceptions.ChefRuntimeError

Bases: Exception

exception ddf_utils.chef.exceptions.IngredientError

Bases: Exception

exception ddf_utils.chef.exceptions.ProcedureError

Bases: Exception

ddf_utils.chef.helpers module

ddf_utils.chef.helpers.build_dictionary(chef, dict_def, ignore_case=False, value_modifier=None)

build a dictionary from a dictionary definition

ddf_utils.chef.helpers.build_dictionary_from_dataframe(df, keys, value, ignore_case=False)
ddf_utils.chef.helpers.build_dictionary_from_file(file_path)
ddf_utils.chef.helpers.create_dsk(data, parts=10)

given a dictionary of {string: pandas dataframe}, create a new dictionary of dask dataframes

ddf_utils.chef.helpers.debuggable(func)

return a function that accepts debug as a keyword parameter.
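To illustrate the idea of a decorator that adds a debug keyword, here is a minimal sketch. The name debuggable_sketch and the behaviour in debug mode (printing the result) are assumptions made for illustration; the library's own debuggable may do something different with the flag.

```python
import functools


def debuggable_sketch(func):
    """Hypothetical stand-in: wrap func so it also accepts debug=..."""
    @functools.wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        # The wrapped function never sees the debug keyword.
        result = func(*args, **kwargs)
        if debug:
            # Assumption: debug mode simply reports the result.
            print(f"[debug] {func.__name__} returned {result!r}")
        return result
    return wrapper


@debuggable_sketch
def add(a, b):
    return a + b


total = add(1, 2, debug=True)
```

Because the wrapper consumes debug itself, existing functions can be decorated without changing their signatures.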

ddf_utils.chef.helpers.dsk_to_pandas(data)

The reverse of the create_dsk function

ddf_utils.chef.helpers.gen_query(conds, scope=None, available_scopes=None)

generate dataframe query from mongo-like queries
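As a rough illustration of translating mongo-like conditions into a dataframe query string, consider the sketch below. The operator table, the function name gen_query_sketch, and the output format are assumptions; the scope and available_scopes parameters of the real helper are not modelled here.

```python
# Assumed mapping from mongo-like operators to query-string operators.
OPERATORS = {
    "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<=",
    "$eq": "==", "$ne": "!=", "$in": "in", "$nin": "not in",
}


def gen_query_sketch(conds):
    """Hypothetical: turn {column: {operator: value}} into a query string."""
    clauses = []
    for column, cond in conds.items():
        if not isinstance(cond, dict):
            # A bare value is treated as an equality test.
            cond = {"$eq": cond}
        for op, value in cond.items():
            clauses.append(f"{column} {OPERATORS[op]} {value!r}")
    return " and ".join(clauses)


query = gen_query_sketch({
    "year": {"$gte": 2000, "$lt": 2010},
    "geo": {"$in": ["swe", "nor"]},
})
```

A string of this shape could then be passed to pandas' DataFrame.query.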

ddf_utils.chef.helpers.gen_sym(key, others=None, options=None)

generate symbol for chef ingredient/procedure result

ddf_utils.chef.helpers.get_procedure(procedure, base_dir)

return a procedure function from the procedure name

Parameters:
  • procedure (str) – the procedure to get; supported formats are (1) procedure: sub/dir/module.function and (2) procedure: module.function
  • base_dir (str) – the path for searching procedures
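The two supported formats can be resolved with the standard importlib machinery. The following is a sketch under assumptions: get_procedure_sketch, the module my_procs, and its function double are hypothetical, and the real helper may search and report errors differently.

```python
import importlib
import os
import sys
import tempfile


def get_procedure_sketch(procedure, base_dir):
    """Hypothetical: resolve 'sub/dir/module.function' or 'module.function'."""
    # Split off the optional directory part, then module and function names.
    sub_dir, _, dotted = procedure.rpartition("/")
    module_name, _, func_name = dotted.rpartition(".")
    search_path = os.path.join(base_dir, sub_dir) if sub_dir else base_dir
    sys.path.insert(0, search_path)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
    finally:
        sys.path.remove(search_path)
    return getattr(module, func_name)


# Demo: write a tiny module into a temporary directory and load from it.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "sub", "dir"))
    with open(os.path.join(base, "sub", "dir", "my_procs.py"), "w") as f:
        f.write("def double(x):\n    return x * 2\n")
    double = get_procedure_sketch("sub/dir/my_procs.double", base)
    result = double(21)
```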
ddf_utils.chef.helpers.make_abs_path(path, base_dir)

return an absolute path from a relative path and a base dir.

If path is already an absolute path, base dir is ignored and path is returned as is.
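The behaviour described above can be sketched with os.path alone. The name make_abs_path_sketch is hypothetical, and whether the real helper normalises the joined path is not stated in this reference.

```python
import os


def make_abs_path_sketch(path, base_dir):
    """Hypothetical: join path onto base_dir unless it is already absolute."""
    if os.path.isabs(path):
        # Absolute input: base_dir is ignored.
        return path
    return os.path.join(base_dir, path)


base = os.path.abspath("recipes")
relative_result = make_abs_path_sketch("data/x.csv", base)
absolute_input = os.path.abspath("already_absolute.csv")
absolute_result = make_abs_path_sketch(absolute_input, base)
```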

ddf_utils.chef.helpers.mkfunc(options)

create function wrappers based on the options provided

This function is used in procedures that have a function block, such as ddf_utils.chef.procedure.groupby(). It will try to return functions from numpy or ddf_utils.ops.

Parameters: options (str or dict) – if a dictionary is provided, “function” should be a key in the dictionary
ddf_utils.chef.helpers.prompt_select(selects, text_before=None)

ask the user to choose from a list of options

ddf_utils.chef.helpers.query(df, conditions, available_scopes=None)

query a dataframe with mongo-like queries

ddf_utils.chef.helpers.read_opt(options, key, required=False, default=None, method='get')

utility to read an attribute from an options dictionary

Parameters:
  • options (dict) – the option dictionary to read
  • key (str) – the key to read
Keyword Arguments:
 
  • required (bool) – if true, raise error if the key is not in the option dict
  • default (object) – a default to return if key is not in option dict and required is false
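The interplay of required and default can be sketched as follows. The name read_opt_sketch is hypothetical, and the handling of method (treating it as the name of the dict method used to read the value, for example get or pop) is an assumption, since the parameter is not documented here.

```python
def read_opt_sketch(options, key, required=False, default=None, method="get"):
    """Hypothetical: read key from an options dict."""
    if key in options:
        # Assumption: method names the dict method to call ("get" or "pop").
        return getattr(options, method)(key)
    if required:
        raise KeyError(f"Field {key} is mandatory.")
    return default


opts = {"base": ["population"], "drop": True}
base_value = read_opt_sketch(opts, "base", required=True)
missing_value = read_opt_sketch(opts, "insert_key", default={})
popped_value = read_opt_sketch(opts, "drop", method="pop")
```

With method="pop" the key is removed from the dictionary once it has been read, which makes it easy to check afterwards for options that were never consumed.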
ddf_utils.chef.helpers.sort_df(df, key, sort_key_columns=True, custom_column_order=None)

Sorting df columns and rows.

Parameters:
  • df (pd.DataFrame) – DataFrame to sort
  • key (str or list) – columns of dataframe, to be used as sorting key(s)
Keyword Arguments:
 
  • sort_key_columns (bool) – whether to sort the index columns. If false, index columns retain the order given in the key parameter.
  • custom_column_order (dict) – column weights for columns except keys. Columns not mentioned will have 0 weight. Bigger weight means higher rank.

ddf_utils.chef.ops module

commonly used calculation methods

ddf_utils.chef.ops.aagr(df: DataFrame, window: int = 10)

average annual growth rate

Parameters: window (int) – the rolling window size
Returns: the rolling apply result
Return type: DataFrame
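To make the calculation concrete, here is a pure-Python sketch on a plain list rather than a DataFrame. It assumes that the growth rate is the year-over-year percent change and that the rolling apply is an arithmetic mean over window consecutive rates; aagr_sketch is a hypothetical name, not the library function.

```python
def aagr_sketch(values, window=10):
    """Hypothetical: rolling mean of year-over-year growth rates."""
    # Growth from each observation to the next one.
    growth = [(b - a) / a for a, b in zip(values, values[1:])]
    # Average each run of `window` consecutive growth rates.
    return [
        sum(growth[i - window:i]) / window
        for i in range(window, len(growth) + 1)
    ]


# A series growing 10% per period, averaged over a window of 2.
rates = aagr_sketch([100, 110, 121, 133.1], window=2)
```

With pandas, the same idea would typically be written as df.pct_change().rolling(window).mean().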
ddf_utils.chef.ops.between(x, lower, upper, how='all', include_upper=False, include_lower=False)
ddf_utils.chef.ops.gt(x, val, how='all', include_eq=False)
ddf_utils.chef.ops.lt(x, val, how='all', include_eq=False)
ddf_utils.chef.ops.zcore(x)