Lugh@futurology.today (mod) to Futurology@futurology.today · English · 4 months ago

When AI is tested on questions it can't model from pre-existing answers on the internet, it only scores 10% in the test.

qz.com

Researchers just stumped AI with their most difficult test — but for how long?
A new AI benchmark called "Humanity's Last Exam" stumped top models
  • NuraShiny [any]@hexbear.net · English · 6 upvotes · 4 months ago

    No, because this test will now be discussed and invalidated for that purpose.

    • Lugh@futurology.today (OP, mod) · English · 8 upvotes · 4 months ago

      They say the answer to this issue is they’ve released public question samples, but the real questions are kept private.

      https://agi.safe.ai/
