Pull Request Overview
This PR adds several new transformation classes to the dataframe module, expanding the available data preprocessing capabilities. The changes introduce utility transformations for common data manipulation tasks like shuffling, type conversion, value replacement, and statistical operations.
Key changes include:
- Added 9 new transformation classes extending TransformBase
- Introduced loguru logger for transformation tracking
- Added utility classes for data preprocessing workflows
| Default is to overwrite original dt_column | ||
| """ | ||
|
|
||
| def __init__(self, column_name: str, target_column: Optional[str] = None): |
The type hint Optional[str] is used but Optional is not imported. You need to add Optional to the imports from typing.
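A minimal sketch of the fix follows. The class name `ConvertDatetimeColumn` is hypothetical (the real class is not visible in this hunk), and the `TransformBase` base class is omitted so the snippet runs on its own. Without the import, the annotation would typically raise a `NameError` when the class body is executed, unless annotation evaluation is deferred.

```python
from typing import Optional


class ConvertDatetimeColumn:  # hypothetical name; the real class is not shown in this hunk
    def __init__(self, column_name: str, target_column: Optional[str] = None):
        self.column_name = column_name
        # Fall back to the source column, so the default overwrites it in place
        self.target_column = target_column if target_column is not None else column_name


same = ConvertDatetimeColumn("dt_column")
other = ConvertDatetimeColumn("dt_column", target_column="dt_parsed")
```

On Python 3.10 and later, `str | None` is an equivalent spelling that needs no import.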
|
|
||
| :param lambda_filter: a callable that specifies which row to filter | ||
| :param column_to_replace: which column to replace values with | ||
| :param replacement_value: the value to replace with |
The docstring example shows ReplaceValues but this is the AddColumnWithCondition class. The example should demonstrate AddColumnWithCondition usage instead.
| :param replacement_value: the value to replace with | |
| Example: | |
| ```python | |
| # Suppose you want to add a column "is_bad" that is True if "indicator_column" == "bad_value" | |
| lambda_compute = lambda x: x["indicator_column"] == "bad_value" | |
| add_col = AddColumnWithCondition( | |
| lambda_compute=lambda_compute, | |
| target_column="is_bad" | |
| ) | |
| df = add_col(df) | |
| ``` | |
| :param lambda_compute: a callable that computes the value for each row (applied with DataFrame.apply, axis=1) | |
| :param target_column: the name of the column to add or overwrite |
| ) | ||
| ``` | ||
|
|
||
| :param lambda_filter: a callable that specifies which row to filter |
The parameter documentation refers to lambda_filter but the actual parameter name is lambda_compute. The documentation should match the parameter name.
| :param lambda_filter: a callable that specifies which row to filter | |
| lambda_compute = lambda x: x["indicator_column"] == "bad_value" | |
| replace_val = ReplaceValues( | |
| lambda_compute = lambda_compute, | |
| column_to_replace = "value_a_column", | |
| replacement_value = np.nan | |
| ) | |
| ``` | |
| :param lambda_compute: a callable that specifies which row to filter |
| ``` | ||
|
|
||
| :param lambda_filter: a callable that specifies which row to filter | ||
| :param column_to_replace: which column to replace values with |
The parameter documentation refers to column_to_replace but the actual parameter name is target_column. The documentation should match the parameter name.
| :param column_to_replace: which column to replace values with | |
| :param target_column: which column to replace values with |
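Docstring and signature mismatches like the ones flagged in this thread can be caught mechanically. The sketch below is one possible check, not part of the PR: the helper `undocumented_or_stale_params` is a hypothetical name, and the stand-in class assumes the docstring sits on `__init__` and copies the stale `:param` lines from the diff.

```python
import inspect
import re


def undocumented_or_stale_params(func):
    """Return (missing, stale) parameter names for a function's ``:param`` docs."""
    doc = inspect.getdoc(func) or ""
    # Names documented with Sphinx-style ":param name:" fields
    documented = set(re.findall(r":param\s+(\w+):", doc))
    # Names actually accepted by the function, ignoring ``self``
    actual = {name for name in inspect.signature(func).parameters if name != "self"}
    return sorted(actual - documented), sorted(documented - actual)


class AddColumnWithCondition:  # stand-in carrying the stale docstring from the diff
    def __init__(self, lambda_compute, target_column):
        """
        :param lambda_filter: a callable that specifies which row to filter
        :param column_to_replace: which column to replace values with
        :param replacement_value: the value to replace with
        """
        self.lambda_compute = lambda_compute
        self.target_column = target_column


missing, stale = undocumented_or_stale_params(AddColumnWithCondition.__init__)
print("missing from docs:", missing)
print("documented but absent:", stale)
```

Running a check like this in a unit test would flag all three stale names at once instead of one review comment per parameter.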
|
|
||
| :param lambda_filter: a callable that specifies which row to filter | ||
| :param column_to_replace: which column to replace values with | ||
| :param replacement_value: the value to replace with |
The parameter documentation refers to replacement_value but this parameter doesn't exist in the AddColumnWithCondition class. This documentation appears to be copied from another class.
| :param replacement_value: the value to replace with | |
| """Add a calculated column based on a lambda function. | |
| Example: | |
| ```python | |
| # Adds a new column 'is_bad' based on a condition | |
| lambda_compute = lambda x: x["indicator_column"] == "bad_value" | |
| add_col = AddColumnWithCondition( | |
| lambda_compute=lambda_compute, | |
| target_column="is_bad" | |
| ) | |
| ``` | |
| :param lambda_compute: a callable that computes the value for each row | |
| :param target_column: the name of the column to add or overwrite |
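To make the suggested docstring concrete, here is a rough sketch of how such a transformation could behave. It assumes the callable is applied row-wise with `DataFrame.apply(..., axis=1)`, as an earlier suggestion in this thread states. It omits the `TransformBase` base class, and returning a copy rather than mutating the input is a choice of this sketch, not necessarily what the PR does.

```python
import pandas as pd


class AddColumnWithCondition:  # sketch only; the PR's class extends TransformBase
    def __init__(self, lambda_compute, target_column: str):
        self.lambda_compute = lambda_compute
        self.target_column = target_column

    def __call__(self, df: pd.DataFrame) -> pd.DataFrame:
        out = df.copy()  # avoid mutating the caller's frame (assumption of this sketch)
        # Evaluate the callable once per row and store the result in the target column
        out[self.target_column] = out.apply(self.lambda_compute, axis=1)
        return out


df = pd.DataFrame({"indicator_column": ["ok", "bad_value", "ok"]})
add_col = AddColumnWithCondition(
    lambda_compute=lambda x: x["indicator_column"] == "bad_value",
    target_column="is_bad",
)
result = add_col(df)
```

Only `lambda_compute` and `target_column` appear in this signature, which is why the `replacement_value` line has no place in the docstring.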
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>