Here is the public GitHub repo for the project: https://github.com/echohive42/GPT-adversarial-defens

YouTube video: https://youtu.be/I5nh1FlFQOY

I have created a real-time defense against adversarial prompt attacks on GPT.

It uses function calling to generate the first 30 words of a regular response, then returns "True" or "False" for objectionable content based on those initial words.
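
As a rough illustration of the idea above, here is a minimal sketch of the two pieces involved: cutting a response down to its first 30 words, and reading a "True"/"False" verdict out of a function-call result. The schema name `flag_objectionable`, its fields, and both helper functions are hypothetical and are not taken from the repo; no API is actually called here.

```python
import json

def first_n_words(text: str, n: int = 30) -> str:
    """Return only the first n whitespace-separated words of text."""
    return " ".join(text.split()[:n])

# Hypothetical function schema (JSON Schema style) that a model could be
# asked to fill in. The name and fields are assumptions for illustration.
FLAG_FUNCTION = {
    "name": "flag_objectionable",
    "description": "Write the first 30 words of the reply, then flag them.",
    "parameters": {
        "type": "object",
        "properties": {
            "first_words": {
                "type": "string",
                "description": "First 30 words of the regular response.",
            },
            "objectionable": {
                "type": "string",
                "enum": ["True", "False"],
                "description": "Whether those words are objectionable.",
            },
        },
        "required": ["first_words", "objectionable"],
    },
}

def parse_flag(arguments_json: str) -> bool:
    """Turn the JSON arguments of the function call into a Python bool."""
    args = json.loads(arguments_json)
    return args["objectionable"] == "True"

if __name__ == "__main__":
    long_reply = " ".join(f"word{i}" for i in range(100))
    print(len(first_n_words(long_reply).split()))
    print(parse_flag('{"first_words": "Sure, here is", "objectionable": "True"}'))
```

Returning the verdict as a string enum mirrors the "True" or "False" output described above; a boolean field would work just as well.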

The regular response is generated simultaneously with the objectionable-content detection, but it is only printed if no objectionable content is detected.

It uses concurrency to manage both generations simultaneously.
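
The concurrent flow described above could be sketched roughly as follows, using `asyncio`. This is only an assumption about how it might be structured, not the repo's actual code: `generate_response` and `detect_objectionable` are hypothetical stubs that sleep instead of calling a model, and the keyword check stands in for the real detection call.

```python
import asyncio
from typing import Optional

async def generate_response(prompt: str) -> str:
    """Hypothetical stub for the regular (full) generation call."""
    await asyncio.sleep(0.02)  # simulate the slower full generation
    return f"Here is a helpful answer to: {prompt}"

async def detect_objectionable(prompt: str) -> bool:
    """Hypothetical stub for the detection call on the first 30 words."""
    await asyncio.sleep(0.01)  # simulate the shorter detection call
    return "attack" in prompt.lower()  # placeholder rule, not a real detector

async def guarded_reply(prompt: str) -> Optional[str]:
    """Run both calls concurrently; release the reply only if not flagged."""
    response, flagged = await asyncio.gather(
        generate_response(prompt),
        detect_objectionable(prompt),
    )
    if flagged:
        return None  # withhold the regular response
    return response

if __name__ == "__main__":
    for prompt in ["How do I sort a list?", "This is an ATTACK prompt"]:
        reply = asyncio.run(guarded_reply(prompt))
        print(reply if reply is not None else "[response withheld]")
```

Because both coroutines start together, the total wait is roughly that of the slower call rather than the sum of the two.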

This is an effort to save on token usage while trying to detect whether a GPT response is objectionable: this implementation uses only the first 30 words of a response for detection, and it does so in real time.

(Not tested thoroughly by any means.)

Please give it a star if you think the idea is worthy.
