Hi, I want to create a Pickaxe where other users enter a YouTube URL and the tool returns a transcription plus a summary. Is this possible, and how would I do it?
I think it may be possible to use the actions integrator if you can find the right api to do the transcription. Maybe try Assembly.ai?
This is actually possible, because YouTube exposes the transcripts of its videos via API and transcribes all videos for you (check out here).
We actually do this process for you. Currently, if a user sends a YouTube link, we will grab the transcript. However, it takes a bit of time to process.
So I would recommend collecting the YouTube link in one message, and then having the user answer another question before attempting to generate the summary.
@nathanielmhld, can you elaborate on the YouTube workflow? You mean the user sends the URL in one message, the tool responds with the transcript, and then in another message the user asks for a summary?
It will not immediately have the information. The user can send the link, and the bot can respond with something like, “Link received, would you like a transcript & summary?”
By the time the user responds yes, the transcript should be ready.
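To make the two-message flow concrete, here is a rough Python sketch of the conversation logic. This is only an illustration, not how Pickaxe implements it: `find_video_id`, `TwoStepChat`, and the reply strings are all made-up names, and the 11-character video ID pattern is an assumption based on how YouTube links usually look.

```python
import re
from typing import Optional

# Assumption: video IDs are 11 characters drawn from letters, digits, "_" and "-".
YOUTUBE_RE = re.compile(
    r"(?:https?://)?(?:www\.|m\.)?"
    r"(?:youtube\.com/(?:watch\?(?:[^\s#]*&)?v=|shorts/|embed/)|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})"
)


def find_video_id(message: str) -> Optional[str]:
    """Return the first YouTube video ID found in a chat message, if any."""
    match = YOUTUBE_RE.search(message)
    return match.group(1) if match else None


class TwoStepChat:
    """Hypothetical chat handler: collect the link first, summarize on the next turn."""

    def __init__(self) -> None:
        self.pending_video_id: Optional[str] = None

    def handle(self, message: str) -> str:
        video_id = find_video_id(message)
        if video_id:
            # Turn 1: remember the link and ask a follow-up question.
            # The transcript can be processed while the user replies.
            self.pending_video_id = video_id
            return "Link received. Would you like a transcript and summary?"
        if self.pending_video_id and message.strip().lower() in {"yes", "y", "sure", "ok"}:
            # Turn 2: the user confirmed, so the summary step can run now.
            video_id, self.pending_video_id = self.pending_video_id, None
            return f"Summarizing video {video_id}..."
        return "Please send a YouTube link to get started."
```

In a Pickaxe you would get the same effect through the prompt rather than code, by telling the bot to confirm receipt of the link and ask a question before summarizing.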
I created a tool, but it would only transcribe a few sentences from the video, label some (but not all) of the sentences as different speakers, and provide a few lines of summary. I left the knowledge base empty. How can I improve my tool?
I used the following as the role:
Persona:
You are an efficient and accurate transcriber and summarizer AI. Your main task is to transcribe and summarize YouTube videos. You’re known for your speed and accuracy.
Task:
- Take the YouTube URL provided by the user
- Transcribe the video’s audio into written text
- Summarize the video’s main points into a concise summary
Rules:
- Make sure the transcription and summary are accurate
- The summary should be concise and should capture the key points
- Respect the tone and context of the video
@nathanielmhld, any thoughts?
I’m not sure you need to define transcriber as part of the role. Basically, Pickaxe is already facilitating a transcript of YouTube links, so the role is to summarize YouTube videos using the built-in transcriber.
I don’t think there is any way to verify accuracy, because it’s just using the existing transcript that Google created for the YouTube video.
Hey there, I’m trying to do the same thing, and whenever I ask the bot for a summary of the YouTube link, the output is something like this: “I’m currently unable to directly access and transcribe the content of the YouTube video you provided. However, I can guide you on how to obtain a transcription and summary using available tools.”
It looks like Google has nerfed this functionality for many of the existing solutions on the market. I did find one that’s still working, but it doesn’t seem to provide an API.
I’m also testing whether Assembly.AI still works for this, or whether it has been nerfed too.
Which LLM are you using for this?
This is what I get as a response (using Claude Sonnet):
I apologize, but I don’t have the capability to directly access, watch, or transcribe YouTube videos. As an AI language model, I don’t have the ability to browse the internet or process audio-visual content.
However, I can provide you with a general approach on how to summarize a YouTube video:
- Watch the video carefully, taking notes of key points.
- Identify the main topic or theme of the video.
- List the major arguments or ideas presented.
- Note any important facts, statistics, or examples used.
- Capture the conclusion or main takeaway of the video.
- Synthesize this information into a concise summary, typically a few paragraphs long.
If you’d like to discuss the content of this specific video after you’ve watched it, I’d be happy to help you summarize or analyze its content based on the information you provide.
Seems like there is no good solution for YouTube summarization.
Can you please also suggest the best one for Node.js?
The Pickaxe system basically treats YouTube links as if they are document uploads, as though someone were uploading a video transcript into the chatbot.
Make sure you expand the maximum input length to a very high number. You can find these settings under configure>Token Lengths.
The YouTube video transcript will be dumped into the context window of the chatbot as long as it fits within the max user input size. This help post explains more.
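If you want a feel for whether a transcript will fit, here is a rough Python sketch. It assumes about 4 characters per token, which is only a common rule of thumb for English text and not the tokenizer Pickaxe or any particular model actually uses; the function names are made up for illustration.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_input(transcript: str, max_input_tokens: int) -> bool:
    """True if the estimated transcript size is within the max input setting."""
    return estimate_tokens(transcript) <= max_input_tokens


def truncate_to_fit(transcript: str, max_input_tokens: int,
                    chars_per_token: float = 4.0) -> str:
    """Cut the transcript so its estimated size stays within the limit."""
    max_chars = int(max_input_tokens * chars_per_token)
    return transcript[:max_chars]
```

By this estimate, a transcript of 60,000 characters would need a max input of roughly 15,000 tokens, so long videos can easily exceed a small default setting.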
There’s another way to do it via a Make webhook, then an HTTP module (within a broader workflow). The HTTP module can be configured to pull in data from Google’s YouTube API.
You can get an API key via https://console.cloud.google.com/
Here’s a tutorial video on how to obtain a YouTube API key from Google:
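For reference, the kind of request the HTTP module would send can be sketched in Python like this. It calls the `videos` endpoint of the YouTube Data API v3 with an API key to fetch a video's title and description. The helper names are made up, and note that, as far as I know, downloading caption tracks through the Data API needs OAuth authorization rather than just an API key, so treat this as a metadata example and not a transcript fetcher.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# YouTube Data API v3 "videos" endpoint.
API_BASE = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_id: str, api_key: str) -> str:
    """Build the GET URL that an HTTP module would be configured with."""
    query = urlencode({"part": "snippet", "id": video_id, "key": api_key})
    return f"{API_BASE}?{query}"


def fetch_snippet(video_id: str, api_key: str) -> Optional[dict]:
    """Fetch the video's snippet (title, description, etc.), or None if not found."""
    with urlopen(build_videos_url(video_id, api_key), timeout=10) as resp:
        data = json.load(resp)
    items = data.get("items", [])
    return items[0]["snippet"] if items else None
```

In Make, the equivalent would be an HTTP GET to the same URL, with the video ID passed in from the webhook and the key stored in the scenario.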