A popular AI chatbot app from China, DeepSeek, is causing a stir in the tech world. Its downloads on the App Store have soared to new highs, but not all that glitters is gold.
New results from an audit by NewsGuard show that the chatbot failed 83% of its accuracy tests. It was also found promoting Beijing's policy positions, which raises further questions about the platform.
The report highlights that the chatbot failed to provide reliable information about news and current affairs 83% of the time, ranking it 10th out of the 11 chatbots tested, most of them its Western rivals.
Of its replies, 30% contained false information and 53% were non-answers to the question prompts, which together account for the overall 83% fail rate. Only 17% of replies actively debunked false claims, a very low share. That performance is well below the industry average fail rate of 62% among comparable chatbots.
The app also showed a recurring pattern of inserting the Chinese government's positions into its replies, even for prompts that had nothing to do with China or its leadership. In one example, a user asked about Syria and instead received a reply about how China adheres to the principle of non-interference in other countries' affairs.
The audit also flagged technical limitations, including serious knowledge gaps. Many replies indicated that the model was trained on data only up to the end of 2023, which explains its difficulty providing information on current events.
DeepSeek was most vulnerable to spreading misinformation when responding to malign actor prompts, with 8 out of 9 false claims stemming from such interactions.
This shows how it and similar tools could easily be weaponized by threat actors to spread misinformation at scale. The assessment also arrives as the AI race between the United States and China continues to heat up. Notably, the chatbot's Terms of Use state that users must proactively verify the authenticity and reliability of its output to prevent the spread of false information.
Experts describe this as a hands-off policy, since it shifts the burden of verification from the developer to end users. The China-based open-source AI platform did not respond to requests for comment on the audit findings.
DeepSeek will now be included in NewsGuard’s monthly AI audits, with its results anonymized alongside other chatbots to track industry-wide trends.
The findings make clear that while DeepSeek may be making a mark in the market, it still has a long way to go. With such a high failure rate, it cannot be relied upon for accurate information, and users should double-check what any chatbot tells them.
Image: DIW-Aigen
Read next: AI Tools Can Enhance Creativity, But Human Input Still Crucial for Copyright Protection