GPU Poor POV: Low Hanging Fruits
Sometimes we have to work with a language other than English (what a surprise!), and that can be problematic because, as you may know, most models and algorithms are developed primarily for English.
I was involved in building a RAG system in Polish. First, we needed proper embeddings for Polish text to feed into a lightweight LLM. Looking through the available options, I realized that the existing models were not accurate enough and performed much worse than their English equivalents.
The first thing that comes to mind is:
Let's become a mad scientist, download all the data we can find, and train a model for months until we get a proper one.
But there are a few cons to this:
- It's computationally heavy.
- You are not a full-time researcher.
- You have potential clients who want to use your solution right now, and (in an optimistic mood) they are really happy to use it.
Here come the low-hanging fruits. We developed an easier, workable solution. Instead of training a new SOTA model, we can use a translation model like this one:
https://huggingface.co/Helsinki-NLP/opus-mt-pl-en
translate your knowledge base to English, and use a proper English embedding model at its full accuracy. I converted the translation model using ctranslate2:
ct2-transformers-converter --model Helsinki-NLP/opus-mt-pl-en --output_dir opus-mt-pl-en
With that, real-time inference is almost invisible to the end user (we observed about a 5x speedup compared to the original model).
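If you want to try the same trick, here is a minimal sketch of serving the converted model, following the standard CTranslate2 + Transformers pattern (the function name and the CPU device are my choices, not details from our production setup):

```python
import ctranslate2
import transformers

# Load the CTranslate2-converted model and the original tokenizer.
translator = ctranslate2.Translator("opus-mt-pl-en", device="cpu")
tokenizer = transformers.AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-pl-en")

def translate_pl_to_en(text: str) -> str:
    # CTranslate2 works on token strings: encode -> translate -> decode.
    source = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
    result = translator.translate_batch([source])
    target = result[0].hypotheses[0]
    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target))

print(translate_pl_to_en("Jak działa ten system?"))  # -> "How does this system work?"
```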
And because the knowledge base is indexed, we can return the answer context to the LLM in either language: the index of a context chunk found in the English corpus is equal to the index of the same chunk in the native-language knowledge base.
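In other words, the two corpora stay aligned by position, so retrieval happens in English but the context handed to the LLM can come from the Polish originals. A small sketch of that idea, reusing translate_pl_to_en from above (the embedding model and the sample chunks are placeholders, not our actual setup):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Parallel corpora: position i in en_chunks is the translation of
# position i in pl_chunks, so retrieved indices transfer directly.
pl_chunks = ["Zwrot towaru jest możliwy w ciągu 30 dni.",
             "Dostawa trwa od 2 do 4 dni roboczych."]
en_chunks = [translate_pl_to_en(c) for c in pl_chunks]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # example English model
index = embedder.encode(en_chunks, normalize_embeddings=True)

def retrieve(question_pl: str, top_k: int = 1) -> list[str]:
    # Embed the translated question and search the English index...
    q = embedder.encode([translate_pl_to_en(question_pl)],
                        normalize_embeddings=True)[0]
    best = np.argsort(-(index @ q))[:top_k]
    # ...but return the original Polish chunks at the same indices.
    return [pl_chunks[i] for i in best]
```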
Of course, some tweaks are required: we had to validate the accuracy of the translation. In our case it was high enough to use.
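I won't claim this is how we validated it, but one cheap sanity check is a round trip: translate pl -> en -> pl and flag chunks that come back too different. Here translate_en_to_pl is a hypothetical helper built exactly like translate_pl_to_en, from a converted Helsinki-NLP/opus-mt-en-pl model, and the 0.5 threshold is arbitrary:

```python
from difflib import SequenceMatcher

def round_trip_score(text_pl: str) -> float:
    # pl -> en -> pl; a faithful translation should come back similar.
    back = translate_en_to_pl(translate_pl_to_en(text_pl))
    return SequenceMatcher(None, text_pl.lower(), back.lower()).ratio()

# Flag chunks whose round trip degrades badly for manual review.
suspicious = [c for c in pl_chunks if round_trip_score(c) < 0.5]
```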
So it was a nice journey: we got the work done in a relatively easy way, and there are people who can actually use it, so real value is being created, which is nice. Have a great day and I wish you more effective deploys! <3