Lack of knowledge about whether training data can be trusted is problematic, but the problem is multiplied by how AIs work and how they 'learn'. LLMs draw on a variety of sources, including news media, academic papers, books and Wikipedia. They are trained on vast amounts of text to learn patterns and associations between words, which allows them to generate coherent, contextually relevant language in response to the input they receive. They can answer questions on anything from how to build a website to how to treat a kidney infection. The assumption is that such answers will become better and more nuanced over time as the AI learns, technology advances and more data is used for training. However, if the training data exaggerates certain features and minimises others, existing prejudices and biases will be increasingly amplified.
Additionally, if the data lacks coverage of certain domains or diverse perspectives, the model may develop a limited understanding of those topics, further contributing to its collapse.
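A toy simulation can illustrate the mechanism. The sketch below is purely illustrative, not a real language model: 'perspectives' are reduced to numbers drawn from a normal distribution, each generation of the model is refit on the previous generation's output, and the low-probability tails, where minority perspectives live, are trimmed before refitting, loosely analogous to the truncated sampling used by real LLMs. The distribution, sample sizes and percentile cut-offs are arbitrary assumptions chosen only to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human-written" data with a wide spread of perspectives,
# modelled here as a standard normal distribution (a toy stand-in, not text).
data = rng.normal(loc=0.0, scale=1.0, size=2000)

for generation in range(1, 9):
    # "Train": fit a simple model to the current data (its mean and spread).
    mu, sigma = data.mean(), data.std()

    # "Generate": sample from the fitted model, then discard the
    # low-probability tails (the region where rare perspectives live),
    # loosely analogous to truncated, top-p style sampling.
    samples = rng.normal(mu, sigma, size=4000)
    lo, hi = np.percentile(samples, [10, 90])
    data = samples[(samples >= lo) & (samples <= hi)]

    # The spread of what the next generation trains on shrinks each time.
    print(f"generation {generation}: spread of data = {data.std():.3f}")
```

Run the loop and the measured spread shrinks with every generation: by generation eight the toy model only ever voices a narrow band of its original data, which is the essence of the amplification and collapse described above.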