Define extraction strategy schema typings #543
Replies: 1 comment
@arnm To be honest, this is very beautiful. I agree with you; even I have to check the code to remember. Would you like to create a pull request for the current version so we can continue the discussion there? I really like what you've done here. Also, feel free to join our Discord channels if you'd like: just send me your email address and we can continue the conversation, test this, and potentially add it to the next release.

I actually plan to build two things for this extraction strategy. One is automated schema generation, using a language model that analyzes web pages based on what you're looking for (e.g. "I want all agencies' contact details"). This would make the process automated and painless. The other item on the roadmap is a Chrome extension that lets users select what they want and get the schema on the fly. Perhaps that is one of the reasons I didn't invest more time in the current version. If you're interested, you can join and help with this. Let me know.
Currently, the extraction strategies' schemas are typed as `Dict[str, Any]`, which requires devs to look at the source code of the extraction strategy to see which values are expected and then try to figure out what they do. The documentation on this is still lacking and does not mention everything that `JsonCssExtractionStrategy` can do, for example.

I've generated the following types to help with my use of `JsonCssExtractionStrategy`, and I would like to see types be used to convey expected input.

Example usage:
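To illustrate the idea of replacing `Dict[str, Any]` with something self-documenting, here is a minimal sketch using `TypedDict`. It is not the set of types from the original post. The key names (`name`, `baseSelector`, `fields`, `selector`, `type`, `attribute`), the `FieldType` values, and the helper `as_plain_dict` are all assumptions made for illustration and may not match what the library actually accepts.

```python
from typing import Any, Dict, List, Literal, TypedDict

# Hypothetical set of field kinds; the library's real set may differ.
FieldType = Literal["text", "attribute", "html", "nested", "list"]


class _RequiredField(TypedDict):
    # Keys assumed to be mandatory for every field entry.
    name: str
    type: FieldType


class SchemaField(_RequiredField, total=False):
    # Optional keys (assumed); total=False lets callers omit them.
    selector: str
    attribute: str                 # only meaningful when type == "attribute"
    fields: List["SchemaField"]    # child fields for nested / list kinds


class ExtractionSchema(TypedDict):
    # Top-level schema shape (assumed).
    name: str
    baseSelector: str
    fields: List[SchemaField]


def as_plain_dict(schema: ExtractionSchema) -> Dict[str, Any]:
    """Return a plain dict copy for APIs that still expect Dict[str, Any]."""
    return dict(schema)


# A type checker can now flag misspelled keys or invalid "type" values.
schema: ExtractionSchema = {
    "name": "Products",
    "baseSelector": "div.product",
    "fields": [
        {"name": "title", "type": "text", "selector": "h2"},
        {"name": "link", "type": "attribute", "selector": "a", "attribute": "href"},
    ],
}
```

Because a `TypedDict` is an ordinary `dict` at runtime, a schema written this way could still be passed to any function annotated with `Dict[str, Any]`; the benefit is editor completion and static checking with mypy or pyright, not runtime validation.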