Recipe Cookbook (draft)¶
This document assumes that readers already know the DDF data model. If you are not familiar with it, please refer to the DDF data model document.
What is a Recipe¶
A recipe is a Domain-Specific Language (DSL) for DDF datasets, used to manipulate existing datasets and create new ones. By running a recipe with the recipe executor (Chef), one can easily generate a new dataset with the procedures we provide. Continue with Write Your First Recipe or Structure of a Recipe to learn how to use recipes. You can also check the examples here.
Write Your First Recipe¶
In this section we will go through the usage of recipes. Suppose you are a data provider and you have access to two DDF datasets, ddf--gapminder--population and ddf--bp--energy. Now you want to make a new dataset, which contains oil consumption per person for each country. Let's do it with Recipe!
0. Prologue¶
Before you begin, you need to create a project. With the command line tool ddf, we can get a well designed dataset project directory. Just run ddf new and answer some questions. The script will generate the directory for you. When it's done we are ready to write the recipe.
As you might notice, there is a template in project_root/recipes/, and it's in YAML format. Chef supports recipes in both YAML and JSON format, but we recommend YAML because it's easier to write and read. In this document we will use YAML recipes.
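To get oriented, here is the overall shape of a recipe. Each top-level section is explained in the steps below; all values here are placeholders:

info:
  id: my_dataset_id
config:
  ddf_dir: /path/to/datasets/
ingredients:
  # a list of ingredient definitions
cooking:
  concepts: []
  entities: []
  datapoints: []
serving:
  # a list of ingredients to save to disk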
1. Add basic info¶
First of all, we want to add some metadata to describe the target dataset. This kind of information can be written in the info section. The info section is optional, and its metadata will also be written to the datapackage.json in the final output. You can put anything you want in this section. In our example, we add the following information:
info:
  id: ddf--your_company--oil_per_person
  author: your_company
  version: v1
  license: MIT
  language: en
  base:
    - ddf--gapminder--population
    - ddf--gapminder--geo_entity_domain
    - ddf--bp--energy
2. Set Options¶
The next thing to do is to tell Chef where to look for the source datasets. We can set this option in the config section.
config:
  ddf_dir: /path/to/datasets/
There are other options you can set in this section; check the config section for available options.
3. Define Ingredients¶
The basic object in a recipe is the ingredient. An ingredient defines a collection of data which comes from an existing dataset or is the result of computation between several ingredients. To define ingredients from existing datasets or csv files, we append to the ingredients section; to define ingredients on the fly, we append to the cooking section.

The ingredients section is a list of ingredient objects. An ingredient that reads data from a data package should be defined with the following parameters:
- id: the name of the ingredient
- dataset: the dataset where the ingredient is from
- key: the primary keys to filter from the datapackage, as a comma separated string
- value: a list of concept names to filter from the result of filtering keys, or pass "*" to select all. Alternatively a mongodb-like query can be used.
- filter: optional, only select rows matching the filter. The filter is a dictionary where keys are column names and values are the values to filter on. Alternatively a mongodb-like query can be used.
There are more parameters for ingredient definitions; see the ingredients section document.
In our example, we need population datapoints from the gapminder dataset and oil consumption datapoints from the bp dataset. Noticing that bp uses lower case short names for its geo entities while gapminder uses 3-letter ISO codes for its country entities, we should align them to use one system. So we end up with the ingredients below:
ingredients:
  - id: oil-consumption-datapoints
    dataset: ddf--bp--energy
    key: geo, year
    value:
      - oil_consumption_tonnes
  - id: population-datapoints
    dataset: ddf--gapminder--population
    key: country, year
    value: "*" # note that the * symbol is a reserved symbol in yaml,
               # so we should quote it if we mean a string
  - id: bp-geo-entities
    dataset: ddf--bp--energy
    key: geo
    value: "*"
  - id: gapminder-country-synonyms
    dataset: ddf--gapminder--population
    key: country, synonym
    value: "*"
  - id: gapminder-country-entities # used in extract_concepts and served at the end
    dataset: ddf--gapminder--population
    key: country
    value: "*"
4. Add Cooking Procedures¶
We have all the ingredients we need; the next step is to cook with these ingredients. In a recipe we put all cooking procedures under the cooking section. Because in the DDF model we have 3 kinds of collections: concepts, datapoints and entities, we divide the cooking section into 3 corresponding sub-sections, and each sub-section contains a list of procedures. So the basic format is:
cooking:
  concepts:
    # procedures for concepts here
  entities:
    # procedures for entities here
  datapoints:
    # procedures for datapoints here
Procedures are like functions. They take ingredients as input, operate on them according to options, and return new ingredients as the result. For a complete list of supported procedures, see Available Procedures. With this in mind, we can start writing our cooking procedures. Suppose after some discussion, we decided our task list is:
- datapoints: oil consumption per capita, and use country/year as dimensions.
- entities: use the country entities from Gapminder
- concepts: all concepts from datapoints and entities
First we look at datapoints. To get what we need, we must:
- change the dimensions to country/year for bp and gapminder datapoints
- align bp datapoints to use gapminder’s country entities
- calculate per capita data
We can use the translate_header, translate_column, merge, run_op and filter procedures to get these tasks done.
datapoints:
  # change dimension for bp
  - procedure: translate_header
    ingredients:
      - oil-consumption-datapoints
    options:
      dictionary:
        geo: country
    result: bp-datapoints-translated
  # align bp geo to gapminder country
  - procedure: translate_column
    ingredients:
      - bp-geo-entities
    result: bp-geo-translated
    options:
      column: geo_name # the procedure will search for values in this column
      target_column: country # ... and put the matched value in this column
      dictionary:
        base: gapminder-country-synonyms
        # key is the column to search for matches of geo names
        key: synonym
        # value is the column to get new values from
        value: country
  # align bp datapoints to new bp entities
  - procedure: translate_column
    ingredients:
      - bp-datapoints-translated
    result: bp-datapoints-translated-aligned
    options:
      column: country
      target_column: country
      dictionary:
        base: bp-geo-translated
        key: geo
        value: country
  # merge bp/gapminder data and calculate the result
  - procedure: merge
    ingredients:
      - bp-datapoints-translated-aligned
      - population-datapoints
    result: bp-population-merged-datapoints
  - procedure: run_op
    ingredients:
      - bp-population-merged-datapoints
    options:
      op:
        oil_consumption_per_capita: |
          oil_consumption_tonnes * 1000 / population
    result: datapoints-calculated
  # only keep the indicator we need
  - procedure: filter
    ingredients:
      - datapoints-calculated
    options:
      item:
        - oil_consumption_per_capita
    result: datapoints-final
For entities, we will just use the country entities from gapminder (the gapminder-country-entities ingredient defined above), so no cooking is needed for them. For concepts, we need to extract concepts from the ingredients:
concepts:
  - procedure: extract_concepts
    ingredients:
      - datapoints-final
      - gapminder-country-entities
    result: concepts-final
    options:
      overwrite: # manually set some concept_types
        year: time
        country: entity_domain
5. Serve Dishes¶
After all these procedures, we have cooked the dishes and it's time to serve them! In a recipe we can set which ingredients we are going to serve (save to disk) in the serving section. Note that this section is optional; if you don't specify it, the result of the last procedure of each sub-section of cooking will be served.
serving:
  - id: concepts-final
  - id: gapminder-country-entities
  - id: datapoints-final
Now we have finished the recipe. For the complete recipe, please check this gist.
6. Running the Recipe¶
To run the recipe and generate the dataset, we use the ddf command line tool. Run the following command and it will cook for you; the result will be saved into out_dir.
ddf run_recipe -i example.yml -o out_dir
If you want to do a dry run without saving the result, you can run with the -d option.
ddf run_recipe -i example.yml -d
Now you have learned the basics of Recipe. We will go through more details of recipes in the next sections.
Structure of a Recipe¶
A recipe is made of the following parts:
- basic info
- configuration
- includes
- ingredients
- cooking procedures
- serving section
A recipe file can be in either json or yaml format. We will explain each part of a recipe in detail in the next sections.
info section¶
All basic info is stored in the info section of the recipe. An id field is required inside this section. Any other information about the new dataset can be stored inside this section, such as name, provider, description and so on. Data in this section will be written into the datapackage.json file of the generated dataset.
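For example, a minimal info section only needs the required id field; the other fields below are illustrative:

info:
  id: ddf--your_company--oil_per_person
  author: your_company
  description: oil consumption per person for each country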
config section¶
Inside the config section, we define the configuration of directories. Currently we can set the following paths:

- ddf_dir: the directory that contains all ddf csv repos. This must be set in the main recipe to run with Chef, or provided as a command line option using the ddf utility.
- recipes_dir: the directory that contains all recipes to include. This must be set if there is an include section. If a relative path is provided, the path will be relative to the path of the recipe.
- dictionary_dir: the directory that contains all translation files. This must be set if there are json file names in the options of procedures (translation will be discussed later). If a relative path is provided, the path will be relative to the path of the recipe.
- procedures_dir: when you want to use custom procedures, set this option to tell Chef which directory the procedures are in.
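Putting these together, a config section using several of these options might look like the following sketch (all paths are placeholders):

config:
  ddf_dir: /path/to/datasets/
  recipes_dir: ./recipes
  dictionary_dir: ./translation_dictionaries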
include section¶
A recipe can include other recipes inside itself. To include a recipe, simply append the filename to the include section. Note that it should be an absolute path or a filename inside the recipes_dir.
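For example, assuming recipes_dir is set in the config section, an include section is simply a list of recipe files (the filenames here are hypothetical):

include:
  - common_ingredients.yaml
  - cooking_datapoints.yaml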
ingredients section¶
A recipe must have some ingredients for cooking. There are 2 places where we can define ingredients in a recipe:

- in the ingredients section
- in the ingredients parameter of procedures, which is called on-the-fly ingredients

In either case, the format of the ingredient definition object is the same. An ingredient should be defined with the following parameters:

- id: the name of the ingredient, which we can refer to later in the procedures. id is optional when the ingredient is defined inside a procedure object.
- dataset or data: one of them should be defined in the ingredient. Use dataset when we want to read data from a dataset, and data when we want to read data from a csv file.
- key: the primary keys to filter from the datapackage, as a comma separated string
- value: optional, a list of concept names to filter from the result of filtering keys, or pass "*" to select all. Mongo-like queries are also supported, see the examples below. If omitted, "*" is assumed.
- filter: optional, only select rows matching the filter. The filter is a dictionary where keys are column names and values are the values to filter on. Mongo-like queries are also supported, see the examples below and the examples in the filter procedure.
Here is an example ingredient object in recipe:
id: example-ingredient
dataset: ddf--example--dataset
key: "geo,time" # key columns of ingredient
value: # only include concepts listed here
  - concept_1
  - concept_2
filter: # select rows by column values
  geo: # only keep datapoints where `geo` is in [swe, usa, chn]
    - swe
    - usa
    - chn
value and filter can accept mongo-like queries to make more complex statements, for example:
id: example-ingredient
dataset: ddf--example--dataset
key: geo, time
value:
  $nin: # exclude following indicators
    - concept1
    - concept2
filter:
  geo:
    $in:
      - swe
      - usa
      - chn
  year:
    $and:
      $gt: 2000
      $lt: 2015
For now, value accepts the $in and $nin keywords, but only one of them can be used in the value option. filter supports the logical keywords $and, $or, $not, $nor, and the comparison keywords $eq, $gt, $gte, $lt, $lte, $ne, $in, $nin.
The other way to define the ingredient data is to use the data keyword to include an external csv file, or to inline the data in the ingredient definition. Example:
id: example-ingredient
key: concept
data: external_concepts.csv
You can also inline the data to create an on-the-fly ingredient:
id: example-ingredient
key: concept
data:
  - concept: concept_1
    name: concept_name_1
    concept_type: string
    description: concept_description_1
  - concept: concept_2
    name: concept_name_2
    concept_type: measure
    description: concept_description_2
cooking section¶
The cooking section is a dictionary that contains one or more lists of procedures to build a dataset. Valid keys for the cooking section are datapoints, entities and concepts.
The basic format of a procedure is:
procedure: proc_name
ingredients:
  - ingredient_to_run_the_proc
options: # options object to pass to the procedure
  foo: baz
result: id_of_new_ingredient
Available procedures are listed in the section below.
serving section and serve procedure¶
For now there are 2 ways to tell chef which ingredients should be served, and you can choose one of them, but not both.
serve procedure
The serve procedure should be placed in the cooking section, with the following format:
procedure: serve
ingredients:
  - ingredient_to_serve
options:
  opt: val
Multiple serve procedures are allowed in each cooking section.
serving section
The serving section should be a top level object in the recipe, with the following format:
serving:
  - id: ingredient_to_serve_1
    options:
      opt: val
  - id: ingredient_to_serve_2
    options:
      foo: baz
available options
- digits: int, controls how many decimals to keep at most in a numeric ingredient.
- no_keep_sets: bool. By default chef serves the entities by entity_sets, i.e. each entity set will have one file. Enabling this will make chef serve the entire domain in one file, with no separate files per set.
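For example, to serve the tutorial's final datapoints rounded to at most 2 decimals:

serving:
  - id: datapoints-final
    options:
      digits: 2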
Recipe execution¶
To run a recipe, you can use the ddf run_recipe
command:
$ ddf run_recipe -i path_to_recipe.yaml -o output_dir
You can specify the path where your datasets are stored:
$ ddf run_recipe -i path_to_recipe.yaml -o output_dir --ddf_dir path_to_datasets
Internally, the process to generate a dataset has the following steps:
- read the main recipe into Python object
- if there is an include section, read each file in the include list and expand the main recipe
- if the dictionary option of a procedure contains a filename, expand it by loading the dictionary from that file
- check if all datasets are available
- build a procedure dependency tree, check if there are loops in it
- if there is no serve procedure and no serving section, the last procedure result for each section will be served; if there is a serve procedure or a serving section, chef will serve the results as described
- run the procedures for each ingredient to be served, and their dependencies
- save output to disk
If you want to embed the function into your script, you can write a script like this:
import ddf_utils.chef as chef

def run_recipe(recipe_file, outdir):
    recipe = chef.build_recipe(recipe_file)  # get all sub-recipes and dictionaries
    res = chef.run_recipe(recipe)  # run the recipe, get output for serving
    chef.dishes_to_disk(res)  # save output to disk

run_recipe(path_to_recipe, outdir)
Available procedures¶
Currently supported procedures:
- translate_header: change ingredient data header according to a mapping dictionary
- translate_column: change column values of ingredient data according to a mapping dictionary
- merge: merge ingredients together on their keys
- groupby: group ingredient by columns and do aggregate/filter/transform
- window: run function on rolling windows
- filter: filter ingredient data with Mongo-like query
- filter_row: filter ingredient data by column values
- filter_item: filter ingredient data by concepts
- run_op: run math operations on ingredient columns
- extract_concepts: generate concepts ingredient from other ingredients
- trend_bridge: connect 2 ingredients and make custom smoothing
- flatten: flatten dimensions in the indicators to create new indicators
- split_entity: split an entity and create new entity from it
- merge_entity: merge some entity to create a new entity
translate_header¶
Change ingredient data header according to a mapping dictionary.
usage and options
procedure: translate_header
ingredients: # list of ingredient id
  - ingredient_id
result: str # new ingredient id
options:
  dictionary: str or dict # file name or mappings dictionary
notes
- if the dictionary option is a dictionary, it should be a dictionary of oldname -> newname mappings; if it's a string, the string should be the name of a json file that contains such a dictionary.
- currently chef only supports one ingredient in the ingredients parameter
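For example, the first step of the tutorial above renames the geo column to country with a plain mapping dictionary:

procedure: translate_header
ingredients:
  - oil-consumption-datapoints
options:
  dictionary:
    geo: country
result: bp-datapoints-translated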
translate_column¶
Change column values of ingredient data according to a mapping dictionary; the dictionary can be generated from another ingredient.
usage and options
procedure: translate_column
ingredients: # list of ingredient id
  - ingredient_id
result: str # new ingredient id
options:
  column: str # the column to be translated
  target_column: str # optional, the target column to store the translated data
  not_found: {'drop', 'include', 'error'} # optional, the behavior when there are values not found in the mapping dictionary, default is 'drop'
  ambiguity: {'prompt', 'skip', 'error'} # optional, the behavior when there is ambiguity in the dictionary
  dictionary: str or dict # file name or mappings dictionary
notes
- If base is provided in dictionary, key and value should also be in dictionary. In this case chef will generate a mapping dictionary using the base ingredient. The dictionary format will be:

dictionary:
  base: str # ingredient name
  key: str or list # the columns to be the keys of the dictionary, can accept a list
  value: str # the column to be the values of the dictionary, must be one column

- currently chef only supports one ingredient in the ingredients parameter
examples
Here is an example where we translate the BP geo names into Gapminder's:
procedure: translate_column
ingredients:
  - bp-geo
options:
  column: name
  target_column: geo_new
  dictionary:
    base: gw-countries
    key: ['alternative_1', 'alternative_2', 'alternative_3',
          'alternative_4_cdiac', 'pandg', 'god_id', 'alt_5', 'upper_case_name',
          'iso3166_1_alpha2', 'iso3166_1_alpha3', 'arb1', 'arb2', 'arb3', 'arb4',
          'arb5', 'arb6', 'name']
    value: country
  not_found: drop
result: geo-aligned
merge¶
Merge ingredients together on their keys.
usage and options
procedure: merge
ingredients: # list of ingredient id
  - ingredient_id_1
  - ingredient_id_2
  - ingredient_id_3
  # ...
result: str # new ingredient id
options:
  deep: bool # use deep merge if true
notes
- The ingredients will be merged one by one in the order they are provided to this function. Later ones will overwrite the previous merged results.
- deep merge checks every datapoint for existence. If deep is false, overwriting happens at the file level: if a key-value pair (e.g. geo,year-population_total) exists, the whole file gets overwritten. If deep is true, overwriting happens at the row level: if a value (e.g. afr,2015-population_total) exists it gets overwritten; if it doesn't, it stays.
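To make the two modes concrete, here is a sketch of the tutorial's merge with row-level overwriting enabled; whether deep: true is needed depends on whether the ingredients overlap on their keys:

- procedure: merge
  ingredients:
    - bp-datapoints-translated-aligned
    - population-datapoints
  options:
    deep: true
  result: bp-population-merged-datapoints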
groupby¶
Group ingredient by columns and do aggregate/filter/transform.
usage and options
procedure: groupby
ingredients: # list of ingredient id
  - ingredient_id
result: str # new ingredient id
options:
  groupby: str or list # column(s) to group
  aggregate: dict # function block
  transform: dict # function block
  filter: dict # function block
  insert_key: dict # manually add columns
notes
- Only one of aggregate, transform or filter can be used in one procedure.
- Any columns not mentioned in groupby or the function block are dropped.
- If you want to add back dropped columns with the same values, use the insert_key option.
- Currently chef only supports one ingredient in the ingredients parameter
function block
Two styles of function block are supported, and they can mix in one procedure:
aggregate: # or transform, filter
  col1: sum # run sum to col1
  col2: mean
  col3: # run foo to col3 with param1=baz
    function: foo
    param1: baz
Also, we can use wildcards in the column names:
aggregate: # or transform, filter
  "population*": sum # run sum on all indicators starting with "population"
window¶
Run function on rolling windows.
usage and options
procedure: window
ingredients: # list of ingredient id
  - ingredient_id
result: str # new ingredient id
options:
  window:
    column: str # column which window is created from
    size: int or 'expanding' # if int then rolling window, if 'expanding' then expanding window
    min_periods: int # as in pandas
    center: bool # as in pandas
  aggregate: dict
function block
Two styles of function block are supported, and they can mix in one procedure:
aggregate:
  col1: sum # run rolling sum to col1
  col2: mean # run rolling mean to col2
  col3: # run foo to col3 with param1=baz
    function: foo
    param1: baz
notes
- currently chef only supports one ingredient in the ingredients parameter
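As an illustration, here is a hypothetical rolling mean over a 10-step window along the year column (the ingredient id and column names are illustrative):

- procedure: window
  ingredients:
    - population-datapoints
  options:
    window:
      column: year
      size: 10
      min_periods: 1
      center: false
    aggregate:
      population: mean
  result: population-rolling-mean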
filter¶
Filter ingredient data with Mongo-like queries. You can filter the ingredient by item, which means indicators in datapoints or columns in other types of ingredients, and/or by row.

The item filter accepts a list of items, or a list under $in or $nin. The row filter accepts a query similar to mongo queries; supported keywords are $and, $or, $eq, $ne, $gt, $lt. See below for an example.
usage and options:
- procedure: filter
  ingredients:
    - ingredient_id
  options:
    item: # just as `value` in ingredient definition
      $in:
        - concept_1
        - concept_2
    row: # just as `filter` in ingredient definition
      $and:
        geo:
          $ne: usa
        year:
          $gt: 2010
  result: output_ingredient
For more information, see the ddf_utils.chef.model.ingredient.Ingredient class and the ddf_utils.chef.procedure.filter() function.
run_op¶
Run math operations on ingredient columns.
usage and options
procedure: run_op
ingredients: # list of ingredient id
  - ingredient_id
result: str # new ingredient id
options:
  op: dict # column name -> calculation mappings
notes
- currently chef only supports one ingredient in the ingredients parameter
Examples
For example, if we want to add two columns, col_a and col_b, to create a new column, we can write:
procedure: run_op
ingredients:
  - ingredient_to_run
result: new_ingredient_id
options:
  op:
    new_col_name: "col_a + col_b"
extract_concepts¶
Generate concepts ingredient from other ingredients.
usage and options
procedure: extract_concepts
ingredients: # list of ingredient id
  - ingredient_id_1
  - ingredient_id_2
result: str # new ingredient id
options:
  join: # optional
    base: str # base concept ingredient id
    type: {'full_outer', 'ingredients_outer'} # default is full_outer
  include_keys: true # if we should include the primaryKeys of the ingredients
  overwrite: # overwrite some of the concept types
    year: time
notes
- all concepts in the ingredients of the ingredients parameter will be extracted into a new concept ingredient
- the join option is optional; if present, the base ingredient will be merged with the concepts from the ingredients
- full_outer join means take the union of concepts; ingredients_outer means only keep concepts appearing in the ingredients
trend_bridge¶
Connect 2 ingredients and make custom smoothing.
usage and options
- procedure: trend_bridge
  ingredients:
    - data_ingredient # optional, if not set defaults to empty ingredient
  options:
    bridge_start:
      ingredient: old_data_ingredient # optional, if not set then assume it's the input ingredient
      column: concept_old_data
    bridge_end:
      ingredient: new_data_ingredient # optional, if not set then assume it's the input ingredient
      column: concept_new_data
    bridge_length: 5 # steps in time. If year, years; if days, days.
    bridge_on: time # the index column to build the bridge with
    target_column: concept_in_result # overwrites if exists, creates if not;
                                     # defaults to bridge_end.column
  result: data_bridged
flatten¶
Flatten dimensions to create new indicators.
This procedure only applies for datapoints ingredients.
usage and options
- procedure: flatten
  ingredients:
    - data_ingredient
  options:
    flatten_dimensions: # a list of dimensions to be flattened
      - entity_1
      - entity_2
    dictionary: # old name -> new name mappings, supports wildcard and template
      "old_name_wildcard": "new_name_{entity_1}_{entity_2}"
example
For example, suppose we have datapoints for population by gender, year and country, and the gender entity domain has male and female entities. We want to create 2 separate indicators, population_male and population_female. The procedure should be:
- procedure: flatten
  ingredients:
    - population_by_gender_ingredient
  options:
    flatten_dimensions:
      - gender
    dictionary:
      "population": "{concept}_{gender}" # concept will be mapped to the concept name being flattened
split_entity¶
(WIP) split an entity into several entities
merge_entity¶
(WIP) merge several entities into one new entity
custom procedures¶
You can also load your own procedures. The procedure name should be module.function, where module should be in the procedures_dir or another path in sys.path.
The procedure should be defined with the following structure:

from ddf_utils.chef.model.chef import Chef
from ddf_utils.chef.model.ingredient import ProcedureResult
from ddf_utils.chef.helpers import debuggable

@debuggable  # adds the debug option to the procedure
def custom_procedure(chef, ingredients, result, **options):
    # you must have chef (a Chef object), ingredients (a list of strings)
    # and result (a string) as parameters
    #
    # procedures...
    # and finally return a ProcedureResult object
    return ProcedureResult(chef, result, primarykey, data)
Check our predefined procedures for examples.
Checking Intermediate Results¶
Most of the procedures support the debug option, which will save the result ingredient to the _debug/<ingredient_id>/ folder of your working directory. So if you want to check intermediate results, just add debug: true to the options dictionary.
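For example, adding debug: true to the tutorial's first translate_header step will write its result to _debug/bp-datapoints-translated/:

- procedure: translate_header
  ingredients:
    - oil-consumption-datapoints
  options:
    dictionary:
      geo: country
    debug: true
  result: bp-datapoints-translated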
Validate the Result with ddf-validation¶
After generating the dataset, it is good practice to check if the output dataset is valid against the DDF CSV model. There is a tool for that, ddf-validation, which is written in nodejs.
To check if a dataset is valid, install ddf-validation and run:
cd path_to_your_dataset
validate-ddf
Validate Recipe with Schema¶
In ddf_utils we provide a command for recipe writers to check whether a recipe is valid, using a JSON schema for recipes. The following command checks and reports any errors in a recipe:
$ ddf validate_recipe input.yaml
Note that if you have includes in your recipe, you may want to build a complete recipe before validating it. You can first build your recipe and then validate it:
$ ddf build_recipe input.yaml > output.json
$ ddf validate_recipe output.json
or just run ddf validate_recipe --build input.yaml without creating a new file.
The validate command will output the json paths that are invalid, so that you can easily check which part of your recipe is wrong. For example,
$ ddf validate_recipe --build etl.yml
On .cooking.datapoints[7]
{'procedure': 'translate_header', 'ingredients': ['unpop-datapoints-pop-by-age-aligned'], 'options': {'dictionary_f': {'country_code': 'country'}}, 'result': 'unpop-datapoints-pop-by-age-country'} is not valid under any of the given schemas
For a pretty printed output of the invalid path, try using json processors like jq:
$ ddf build_recipe etl.yml | jq ".cooking.datapoints[7]"
{
    "procedure": "translate_header",
    "ingredients": [
        "unpop-datapoints-pop-by-age-aligned"
    ],
    "options": {
        "dictionary_f": {
            "country_code": "country"
        }
    },
    "result": "unpop-datapoints-pop-by-age-country"
}
Other than the json schema, we can also validate a recipe using dhall, as we will discuss in the next section.
Write recipe in Dhall¶
Sometimes there are recurring tasks; for example, you might apply the same procedures again and again to different ingredients. In such cases we can benefit from the Dhall language. There are also other advantages of Dhall over yaml, such as type checking. We provide type definitions for recipes in another repo. Check the examples in that repo to see how to use them.
General guidelines for writing recipes¶
- if you need to use translate_header / translate_column in your recipe, place them at the beginning of the recipe. This can improve the performance of running the recipe.
- running a recipe with ddf --debug run_recipe will enable debug output when running recipes. Using it together with the debug option will help you in the development of recipes.
The Hy Mode¶
From Hy’s home page:
Hy is a wonderful dialect of Lisp that’s embedded in Python.
Since Hy transforms its Lisp code into the Python Abstract Syntax Tree, you have the whole beautiful world of Python at your fingertips, in Lisp form!
Okay, if you’re still with me, then let’s dive into the world of Hy recipe!
We provided some macros for writing recipes. They are similar to the sections in YAML:
;; import all macros
(require [ddf_utils.chef.hy_mod.macros [*]])
;; you should call init macro at the beginning.
;; This will initialize a global variable *chef*
(init)
;; info macro, just like the info section in YAML
(info :id "my_fancy_dataset"
:date "2017-12-01")
;; config macro, just like the config section in YAML
(config :ddf_dir "path_to_ddf_dir"
:dictionary_dir "path_to_dict_dir")
;; ingredient macro, each one defines an ingredient. Just like a
;; list element in ingredients section in YAML
(ingredient :id "datapoints-source"
:dataset "source_dataset"
:key "geo, year")
;; procedure macro, each one defines a procedure. Just like an element
;; in the cooking blocks. First 2 parameters are the result id and the
;; collection it's in.
(procedure "result-ingredient" "datapoints"
:procedure "translate_header"
:ingredients ["datapoints-source"]
:options {:dictionary
{"geo" "country"}}) ;; it doesn't matter if you use keyword or plain string
;; for the options dictionary's key
;; serve macro
(serve :ingredients ["result-ingredient"]
:options {"digits" 2})
;; run the recipe
(run)
;; you can do anything to the global chef element
(print (*chef*.to_recipe))
There are more examples in the example folder.