Overview
Many companies are experimenting with generative AI and Large Language Models (LLMs) or developing new services based on them, particularly in marketing and communication, and increasingly in market analysis. However, as both a literature review and discussions with industry experts show, a consistent and comprehensive quality assessment of their outputs is still lacking.
This project aims to develop and test an easy-to-apply, interdisciplinary LLM evaluation framework grounded in state-of-the-art methods, academic literature, and dialogue with practitioners and industry experts. Within the proposed project, a Minimum Viable Concept (MVC) of the framework will be developed and tested on a dataset, with a view to a follow-up project in which the framework can be further expanded, refined, and validated.