Chat with and Summarize PDF documents with Langchain and OpenAI video files:

This is for video: https://youtu.be/JJ6ATxp42cQ

Comments

Adolfo Rodriguez

Would it be possible to estimate the cost before doing the embeddings?

Network Technician

Hi, I am getting this error when I try to install the requirements. Using conda. Do you know what the issue is here?
creating build\temp.win-amd64-cpython-311\Release\src\sentencepiece
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\mikew\AppData\Local\Programs\Python\Python311\include -IC:\Users\mikew\AppData\Local\Programs\Python\Python311\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /EHsc /Tpsrc/sentencepiece/sentencepiece_wrap.cxx /Fobuild\temp.win-amd64-cpython-311\Release\src/sentencepiece/sentencepiece_wrap.obj /std:c++17 /MT /I..\build\root\include
cl : Command line warning D9025 : overriding '/MD' with '/MT'
sentencepiece_wrap.cxx
src/sentencepiece/sentencepiece_wrap.cxx(2822): fatal error C1083: Cannot open include file: 'sentencepiece_processor.h': No such file or directory
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.35.32215\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]

echohive42

Yes, it would. Use tiktoken to count the tokens, then do some quick math with the embeddings cost per 1k tokens.
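A minimal sketch of that estimate, in case it helps. The function names, the default model name "text-embedding-ada-002", and the price used in the usage comment are assumptions for illustration, not taken from the video's code; check the current OpenAI pricing page for the real per-1k-token rate.

```python
def count_tokens(texts, model="text-embedding-ada-002"):
    """Return the token count of each text using tiktoken's encoding for `model`.

    The model name is an assumption; pass whichever embedding model you use.
    """
    import tiktoken  # imported here so the cost arithmetic below works without it

    encoding = tiktoken.encoding_for_model(model)
    return [len(encoding.encode(text)) for text in texts]


def estimate_embedding_cost(token_counts, price_per_1k_tokens):
    """Estimate the embedding cost from per-chunk token counts.

    `price_per_1k_tokens` is whatever the pricing page currently lists.
    """
    total_tokens = sum(token_counts)
    return total_tokens / 1000 * price_per_1k_tokens


# Usage (the price below is a placeholder, not a quoted rate):
# counts = count_tokens(["first page text", "second page text"])
# print(estimate_embedding_cost(counts, price_per_1k_tokens=0.0004))
```

Running this over the page texts before calling the embeddings endpoint gives a rough upper bound, since every chunk is embedded once.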

echohive42

I have never seen this error before, so I am not sure how to approach it. Please do a Google search for it. Also try installing the requirements one by one using "pip install <package> --upgrade" (note the two dashes before "upgrade").

Chuck Williams

How difficult would it be to add a Streamlit UI and file uploader to this script, similar to your other video where there is a Streamlit UI with the file uploader? Also, can gpt-4 answer questions across PDFs, i.e. get answers from multiple files? Thanks a ton.

echohive42

It shouldn't be too difficult. I tried and didn't get it working the way I wanted, but I was short on time. gpt-4 can, sure; you just need to change the model name. But be mindful of the cost.

Chuck Williams

Also, can I only use the 3.5 turbo model, or can I use gpt-4? Do I just update this code from def __init__(self, model_name="gpt-3.5-turbo", temperature=0) to def __init__(self, model_name="gpt-4", temperature=0)?

Chuck Williams

Also, I put my OpenAI key in this section: openai.api_key = " ". I put in my key and keep getting this error: NameError: name 'openai' is not defined.

Chuck Williams

I have installed openai. I ran pip install openai and I am getting a message that openai is already installed. What might I be missing? Thanks a ton.

Chuck Williams

So I updated the code to import openai and am now getting this error:
Traceback (most recent call last):
  File "/home/chuckwilliams11/chat-financials/main.py", line 131, in <module>
    chat = Chat_With_PDFs_and_Summarize()
  File "/home/chuckwilliams11/chat-financials/main.py", line 33, in __init__
    self.llm_summarize = ChatOpenAI(model_name=model_name, temperature=temperature)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for ChatOpenAI
__root__
  Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)
I do have openai.api_key set to my actual API key in the code, so I am not sure what is causing the error. I also ran the pip install requirements.txt --upgrade command to make sure the libraries are updated.

echohive42

If there is a line that reads the key from the environment variables, you need to remove it. Then input the key as a string, in between quotation marks, such as openai.api_key = "827364g.....". Also make sure to save the file before running it when you update the code.

Chuck Williams

I did that. I set this variable, openai.api_key="sk-xxxxxxxxxxxxxxxxxxxxxx", and am still getting the same error. Which line should I look for that checks for the key from the environment variables?

Chuck Williams

I checked the code and don't see anywhere that the os library is checking or setting an environment variable.

echohive42

If there isn't any, that is fine. In the absence of a line of code that explicitly defines the key, the environment variable kicks in automatically.

echohive42

If you also made sure to save the file after updating the line of code with your API key, it should work. I am not sure why this error is happening.

Chuck Williams

Yes, this is strange. I set the openai.api_key variable to my actual OpenAI API key and didn't set an environment variable. I will comment out the openai.api_key variable and try setting it as an environment variable instead.

Chuck Williams

I set the OPENAI_API_KEY environment variable and then added openai.api_key = os.environ["OPENAI_API_KEY"] to the code so the os library pulls it in, and the script worked. Thanks a ton. I am planning to see if I can make this into a Streamlit app, but use the DirectoryLoader from Langchain to auto-load all the PDFs by default. I am going to try to follow your other videos to figure this out. Thanks a ton.
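For anyone hitting the same ValidationError, here is a hedged sketch of the fix described above. Setting openai.api_key alone did not satisfy ChatOpenAI in this thread; the error message itself says the key must come from the OPENAI_API_KEY environment variable or be passed as `openai_api_key`. The helper names load_api_key and build_chat_model are made up for this example, and the import path matches the older langchain versions mentioned in these comments.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Read the API key from the environment and fail early with a clear message."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return key


def build_chat_model(model_name="gpt-3.5-turbo", temperature=0):
    """Create the chat model with the key passed explicitly (hypothetical helper)."""
    # Imported here so load_api_key can be used without langchain installed.
    from langchain.chat_models import ChatOpenAI

    return ChatOpenAI(
        model_name=model_name,
        temperature=temperature,
        openai_api_key=load_api_key(),
    )
```

Passing the key explicitly avoids depending on which of the two mechanisms the installed version happens to check.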

echohive42

I am happy to hear that you got it working! Build some apps and share them at the discord too :)

matari

How would I go about editing the script to handle JSON, CSV and TXT files?

echohive42

You would have to check the file extension and load each file accordingly. You would also have to split them however you wish. PDF files come page by page, but regular txt files don't, so you can split them by a set number of words, or by paragraphs, for example.
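A rough sketch of that idea using only the standard library. The function names and the way JSON and CSV are flattened to text are assumptions for illustration; the original script may structure this differently.

```python
import csv
import json
from pathlib import Path


def load_text(path):
    """Load a .txt, .json or .csv file and return its content as plain text."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix == ".json":
        # Re-serialize so the text is consistently indented.
        return json.dumps(json.loads(path.read_text(encoding="utf-8")), indent=2)
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            return "\n".join(", ".join(row) for row in csv.reader(handle))
    raise ValueError(f"Unsupported file type: {suffix}")


def split_by_words(text, words_per_chunk):
    """Split text into chunks of at most `words_per_chunk` words."""
    words = text.split()
    return [
        " ".join(words[i : i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def split_by_paragraphs(text):
    """Split text on blank lines, dropping empty paragraphs."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]
```

Each chunk can then be embedded the same way the PDF pages are.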

SHUBHAM NAGAR

Should you open source this, let me know. I can contribute.

echohive42

Might I add that you can open source it yourself as well. Just remember the good ol' echo when you do.

matari

Have you checked out databutton.com? It seems like it would streamline sharing your Streamlit scripts directly with Patreon supporters and let them experiment with the scripts with less friction.

Tuan Tran

Why didn't you get this error? "Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-n2em2hU6UImYoI0iahrW8K2k on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method."

echohive42

A rate limit error happens if you are making too many requests to the API in a given time frame. But most of the time this error is because the OpenAI API is overloaded. If your code is not making too many requests, then trying again after a while usually solves the problem.

echohive42

Make sure you are not running an infinite loop in your code or otherwise making too many requests by mistake.

Tuan Tran

No, I just used your code and your PDF, unmodified. Where could I add a 30-second delay in your code?
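One possible way to add such a delay is to space out the API calls so they stay under the "3 / min" limit quoted in the error. This is only a sketch: the Throttle class is made up for this example, and where exactly to call wait() depends on where the script calls the chat model.

```python
import time


class Throttle:
    """Make sure consecutive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        if self.last_call is not None:
            remaining = self.min_interval - (self.clock() - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
        self.last_call = self.clock()


# Usage: 3 requests per minute means one request every 20 seconds.
# throttle = Throttle(min_interval=20)
# for page in pages:
#     throttle.wait()
#     summary = summarize(page)  # hypothetical call into the script
```

Calling wait() right before each request keeps the pacing in one place instead of scattering time.sleep calls through the code.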

Ilias Mokas

Hi echohive, and thank you for the detailed explanation. I am trying to run the code but I am getting this error:
openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
Because the API key is not a personal one, I set it as follows:
os.environ["OPENAI_API_KEY"] = "ABC"
openai.api_key = "ABC"
openai.api_type = "azure"
openai.api_version = "XXX"
openai.api_base = "XXXXX"
Any ideas where I can look to solve this issue? I would very much appreciate your help!

echohive42

I just ran the code and it worked fine on my end. This error means that you are attempting to call a model which doesn't exist. Since we are calling the chat model, this might happen if your openai and langchain libraries aren't up to date. Try pip installing these versions of both: openai==0.27.2 and langchain==0.0.135. With these versions the code is working on my machine without any issues. Hopefully this helps. Also, if you are defining the models explicitly in your code, make sure the model name is spelled correctly.
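A quick way to check whether the installed versions match the ones above, sketched with the standard library only. The function name find_mismatches is made up for this example; the version numbers are the ones quoted in this reply.

```python
from importlib import metadata

# Versions quoted in the reply above.
PINNED = {"openai": "0.27.2", "langchain": "0.0.135"}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (installed, wanted)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


# Usage:
# for name, (installed, wanted) in find_mismatches(PINNED).items():
#     print(f"{name}: installed {installed}, expected {wanted}")
```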

Ilias Mokas

Thank you for the quick reply. I tried this but still have the same issue. I think it is because I am trying to use Azure. Maybe it needs AzureChatOpenAI? I tried the following:
class Chat_With_PDFs_and_Summarize:
    def __init__(self, model_name="text-davinci-003", temperature=0):
        # initialize AzureChatOpenAI for summarization and chat
        self.llm_summarize = AzureChatOpenAI(
            model_name=model_name,
            deployment_name="gpt-langchain",
            openai_api_type="azure",
            openai_api_base=os.environ["OPENAI_API_BASE"],
            openai_api_version=os.environ["OPENAI_API_VERSION"],
            openai_api_key=os.environ["OPENAI_API_KEY"],
            temperature=temperature,
        )
        self.llm_chat = AzureChatOpenAI(
            model_name=model_name,
            deployment_name="gpt-langchain",
            openai_api_type="azure",
            openai_api_base=os.environ["OPENAI_API_BASE"],
            openai_api_version=os.environ["OPENAI_API_VERSION"],
            openai_api_key=os.environ["OPENAI_API_KEY"],
            temperature=temperature,
        )
But I am still getting errors such as:
raise error.InvalidRequestError(
TypeError: InvalidRequestError.__init__() missing 1 required positional argument: 'param'

echohive42

Hmm. I have never used Azure, so I can't speak to it. But you can search for this error on Google and also in the issues on Azure's OpenAI GitHub repo.

Ilias Mokas

Cool! Thank you for your answer :)