Results

Expected results of the project:

Technologies for fine-tuning large language models capable of integrating long-term general knowledge and inferring meaning.
IfGPT will improve how well freely available large language models and chat models reflect the Bulgarian language and culture, taking into account context and the reliability of information. The result is aimed at business representatives, the academic community and the general public.

Large Language Models

Technologies for the collection of clean, non-toxic data that is free of duplicates and personally identifiable information.
IfGPT will improve technologies for extracting clean data without corrupting its content and for compiling large datasets without duplicated content. It will extend methods for selecting non-toxic data and for data depersonalisation. The result is aimed at representatives of the business and academic communities.

IfGPT dataset

Adoption of innovative language technology solutions that provide artificial intelligence applications and services for businesses in Bulgaria.
IfGPT will create an infrastructure for fine-tuning large language models for Bulgarian, as well as for specific domains or tasks. The infrastructure will include tools for creating high-quality datasets and for testing and evaluating fine-tuned large language models. The result is aimed at representatives of the business and academic communities.

Documentation