Content

NOTE: This doesn't include the extracted code dataset or the vector DB, which cost about $75 in API calls to create. You can find all the data here: https://www.patreon.com/posts/87972378
This still includes the originally downloaded LangChain documentation data.

UPDATE: Uploaded a new requirements.txt that lists all the packages.

This is for the video: https://youtu.be/MbL0iLLK20o

NOTE: Running the extract_code_parallel.py file will incur about $75 in API costs!!!

The crawled documentation is 1.5 million tokens 😯

Search 170+ echohive videos and code download links: https://www.echohive.live/

Chat with us on Discord:  https://discord.gg/PPxTP3Cs3G

LLM Paper Summaries: https://llmpapers.up.railway.app/

Source code for AUTO AGI: https://www.patreon.com/posts/code-files-for-87530987

Comments

echohive42

If you run the extract_code_parallel.py file to completion and embed the resulting files, then yes, apart from the minor variations GPT-4 may produce. To test the waters and see the process yourself, you can first limit the list that the thread pool executor works on. I did it in chunks of 100 at first: [:100], then [100:200], etc. It ran for me without errors. Hopefully it will be the same for you.
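The chunked approach described above could look roughly like the sketch below. It is only an illustration, not the actual contents of extract_code_parallel.py: `extract_code` is a hypothetical stand-in for the per-document GPT-4 call (here it makes no API call at all), and `process_in_batches`, `batch_size`, `max_batches` and `max_workers` are names made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_code(doc: str) -> str:
    """Hypothetical placeholder for the per-document GPT-4 call.

    It only upper-cases the text, so running this sketch costs nothing.
    """
    return doc.upper()


def process_in_batches(docs, batch_size=100, max_batches=None, max_workers=8):
    """Run extract_code over docs one slice at a time.

    Setting max_batches=1 processes only docs[:batch_size], which is a
    cheap way to check the pipeline before committing to the full run.
    """
    results = []
    for batch_index, start in enumerate(range(0, len(docs), batch_size)):
        if max_batches is not None and batch_index >= max_batches:
            break
        batch = docs[start:start + batch_size]  # [:100], then [100:200], ...
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            # executor.map returns results in the same order as the inputs
            results.extend(executor.map(extract_code, batch))
        print(f"finished items {start}..{start + len(batch) - 1}")
    return results


docs = [f"doc {i}" for i in range(250)]  # made-up sample data
first_batch = process_in_batches(docs, max_batches=1)
```

Once the first slice looks right, dropping `max_batches` lets the same loop work through the remaining slices. Saving each batch's results before starting the next would also limit how much paid work is lost if a later batch fails.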

Oscar Agreda

The requirements should be:

beautifulsoup4
#langchain[all] you may consider installing them all
numpy
openai
pandas
requests
tenacity
termcolor
tiktoken
matplotlib
plotly
scipy
scikit-learn