Present Enbridge implementation - https://mqidentity-my.sharepoint.com/:w:/g/personal/nkhadri_atomiton_com/ERo7AJWT0rNIv78t_72XPNwBBAANUTov0WtpikAcFRcvaQ?e=kOP3oE
7th May 2021
Config files:
Present ini: uses Python's built-in configparser module, which has no nested sections and no lists. I have written a lot of code to work around these limitations.
ConfigObj (third party): very similar to the present ini, with minor syntax differences. It supports nested sections and lists, which makes it easy to generalize the config file.
JSON, XML - familiar to developers, but maybe not to a layperson. This matters because the config could be edited by operations or by the customer (not now, but sometime later).
Hydra (hydra.cc) from Facebook - takes YAML as input but has a lot of features: logging, debugging, and it can even change the config from the command line. It's very powerful; the question is, do we even need this complexity?
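To make the configparser limitation concrete, here is a minimal sketch of the kind of workaround code it forces. It is not the code from the present implementation: the helper names get_list and get_subsections, the dotted section-name convention, and the sample keys are all assumptions made for illustration.

```python
import configparser

# Hypothetical config text; section and key names are illustrative only.
CONFIG_TEXT = """
[section1]
keyword1 = value1, value2, value3
keyword2 = value2

[section1.subsection]
keyword1 = nested value
"""


def get_list(parser, section, key):
    """Split a comma-separated value into a list of stripped strings."""
    raw = parser.get(section, key)
    return [item.strip() for item in raw.split(",") if item.strip()]


def get_subsections(parser, parent):
    """Return the child sections of `parent`, assuming a dotted-name convention."""
    prefix = parent + "."
    return {
        name[len(prefix):]: dict(parser[name])
        for name in parser.sections()
        if name.startswith(prefix)
    }


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

values = get_list(parser, "section1", "keyword1")
children = get_subsections(parser, "section1")
```

Every list and every level of nesting needs a helper like these, which is the overhead a format with native lists and nested sections would remove.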
ConfigObj config eg:
keyword1 = value1
keyword2 = value2

[section 1]
# with value as list
keyword1 = value1, value2, value3
keyword2 = value2

    # Sub section
    [[sub-section]]
    # this is in section 1
    keyword1 = value1
    keyword2 = value2

        [[[nested section]]]
        # this is in sub section
        keyword1 = value1
        keyword2 = value2

    [[sub-section2]]
    # this is in section 1 again
    keyword1 = value1
    keyword2 = value2

[section 2]
keyword1 = value1
keyword2 = value2
With this layout, the values are easy to access by section and keyword.
Generalize command-line arguments: we can have a class and a function that abstract each argument's name, help text, and default value. The limitation is that the arguments would be passed back in a dictionary.
6th May 2021:
After defining the reusable functions, what else do DAS users need in order to implement feature engineering?
DAS user:
To get the config values, is ".ini" the right format? How do we standardize the config format for reading and parsing?
Passing the appropriate values to the corresponding function
Stitch the previous batch data with this batch
Save intermediate data as CSV and PNG
Read CSV with index_name and date format
A way to get the command-line arguments
Solution:
Configs can be in different formats; they need not follow one standard.
To carry forward the episode duration from the previous batch to the present batch, we need some code to convert it into the required format.
Because we need previous batch data to take the 3-minute average, we need to change the API of SVC, DIV, RVF.
Intermediate files to be saved should be included in the API.
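The need for previous batch data in the 3-minute average can be illustrated with a small sketch. It assumes pandas Series with a sorted DatetimeIndex and a time-based rolling mean; the function name rolling_mean_with_history and the sample values are hypothetical, and this is not the actual SVC, DIV, or RVF code.

```python
import pandas as pd

WINDOW = pd.Timedelta(minutes=3)  # averaging window taken from the notes


def rolling_mean_with_history(previous_batch, current_batch, window=WINDOW):
    """Time-based rolling mean for current_batch, seeded with previous rows.

    Both inputs are assumed to be Series with sorted DatetimeIndex values,
    with previous_batch ending before current_batch starts.
    """
    # Only previous rows newer than (first current timestamp - window)
    # can fall inside a window that ends in the current batch.
    cutoff = current_batch.index[0] - window
    tail = previous_batch[previous_batch.index > cutoff]
    stitched = pd.concat([tail, current_batch])
    averaged = stitched.rolling(window).mean()
    # Drop the seeded rows so the result lines up with current_batch.
    return averaged.iloc[len(tail):]


# Illustrative one-minute samples, not real data.
previous = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2021-05-06 10:00", periods=3, freq="min"),
)
current = pd.Series(
    [4.0, 5.0],
    index=pd.date_range("2021-05-06 10:03", periods=2, freq="min"),
)

with_history = rolling_mean_with_history(previous, current)
without_history = current.rolling(WINDOW).mean()
```

Without the previous batch, the first rows of each batch are averaged over fewer samples than the full window, so the two results differ at the batch boundary. That is why the function signatures would need to accept the previous batch (or its tail) as an input.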
read_file (fileName, index_col, isIndexColDate=False, removeDulplicateIndexWithKeep=None)
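One possible shape for read_file, as a sketch only. It assumes pandas is the CSV reader and that removeDulplicateIndexWithKeep takes the same "first" / "last" values as pandas' keep argument; neither is confirmed by these notes. The sample CSV is made up.

```python
import io

import pandas as pd


def read_file(fileName, index_col, isIndexColDate=False,
              removeDulplicateIndexWithKeep=None):
    """Read a CSV into a DataFrame indexed by `index_col`.

    isIndexColDate: parse the index column as datetimes when True.
    removeDulplicateIndexWithKeep: None keeps all rows; "first" or "last"
        drops rows with a repeated index value, keeping that occurrence.
    """
    df = pd.read_csv(fileName, index_col=index_col)
    if isIndexColDate:
        df.index = pd.to_datetime(df.index)
    if removeDulplicateIndexWithKeep is not None:
        duplicated = df.index.duplicated(keep=removeDulplicateIndexWithKeep)
        df = df[~duplicated]
    return df


# Illustrative data with one repeated timestamp; an in-memory buffer
# stands in for a file path here.
csv_text = (
    "time,value\n"
    "2021-05-06 10:00:00,1\n"
    "2021-05-06 10:00:00,2\n"
    "2021-05-06 10:01:00,3\n"
)
df = read_file(io.StringIO(csv_text), "time", isIndexColDate=True,
               removeDulplicateIndexWithKeep="last")
```

Here the second row for 10:00 wins because keep is "last", leaving two rows with a datetime index.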
CLASS
ArgumentClass(varName, help=None, defaultVal=None)
Eg:
arg1 = ArgumentClass("a")
arg2 = ArgumentClass("--b", "help bb")
arg3 = ArgumentClass("--c")
Function:
argumentDefinition(ArgumentClass objects....)
Eg:
argumentDefinition(arg1, arg2, arg3)
Returns a dictionary keyed by variable name, with the values passed as command-line arguments.
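A minimal sketch of how the class and function could be built on the standard argparse module, following the ArgumentClass(varName, help=None, defaultVal=None) signature above. The argv parameter is an addition for illustration (it lets the example run without a real command line) and is not part of the API described in these notes.

```python
import argparse


class ArgumentClass:
    """Holds one command-line argument's name, help text, and default value."""

    def __init__(self, varName, help=None, defaultVal=None):
        self.varName = varName
        self.help = help
        self.defaultVal = defaultVal


def argumentDefinition(*arguments, argv=None):
    """Register each ArgumentClass with argparse and return parsed values as a dict.

    argv is a hypothetical extra parameter; None means "read sys.argv".
    """
    parser = argparse.ArgumentParser()
    for arg in arguments:
        parser.add_argument(arg.varName, help=arg.help, default=arg.defaultVal)
    # vars() turns the argparse Namespace into a plain dictionary.
    return vars(parser.parse_args(argv))


arg1 = ArgumentClass("a")                    # positional argument
arg2 = ArgumentClass("--b", "help bb")       # optional, with help text
arg3 = ArgumentClass("--c", defaultVal="3")  # optional, with a default
parsed = argumentDefinition(arg1, arg2, arg3, argv=["input.csv", "--b", "5"])
```

In this sketch all values come back as strings, since no type conversion is specified; callers read them as parsed["a"], parsed["b"], and parsed["c"].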