12 Comments

Great writeup! I wonder what impact the EU AI Act might have on Americans, despite their own lack of federal or state laws. Also, with so many developments covered in the first half of your post, I'm trying to figure out where to focus my AI learning and practice in 2025. What are you putting your time into using, and learning to use?

Thank you! We are indeed seeing the Brussels effect in action, with countries like South Korea, and about nine US states (so far), considering or having passed “mini AI acts.”

I primarily use o1/o1 pro, Claude 3.5, GPT-4o with search, Gemini Deep Research, and Elicit for academic lit reviews.

Thanks for the reply. I have been using Perplexity for research on the recommendation of @Ben Reid. I’ll check out Elicit. Have you done anything with workflows or multi-agents? The pundits are telling us that’s the next stage. I am looking for an on-ramp to get started trying out the technology.

Thank you for the article. You say that the new OpenAI o1/o3 models are trained using reinforcement learning, but aren't all modern LLMs trained with reinforcement learning of some type (agents with rewards, or human feedback)?

That's right: everything post-ChatGPT, to a first approximation, is trained using RLHF. But many AI researchers would be quick to point out that "RLHF is not *real* RL." RLHF basically takes human preferences over model responses and uses them to guide models toward answers that humans like better. The RL used to train o1, on the other hand, is about getting the model to *correct*, or at least higher-quality, answers regardless of human preference.
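
To make the contrast concrete, here's a toy sketch (entirely my own illustration, not OpenAI's actual training code) of the two kinds of reward signal:

```python
def rlhf_reward(human_prefers_a: bool) -> tuple[float, float]:
    # RLHF-style signal: whichever response the human preferred gets
    # the reward, regardless of whether it's actually correct.
    return (1.0, 0.0) if human_prefers_a else (0.0, 1.0)

def correctness_reward(response: str, reference_answer: str) -> float:
    # o1-style signal (heavily simplified): reward depends on matching
    # a verifiable answer, regardless of human preference.
    return 1.0 if response.strip() == reference_answer.strip() else 0.0
```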

Do we know how a higher-quality answer is determined? What criteria are used?

There are many different ways to calculate "high quality"... but in the case of RLHF, you typically do A/B testing: show human raters two responses and have them pick the one they prefer.
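
For the curious, the usual way those A/B preferences become a training signal is a pairwise (Bradley-Terry-style) loss on a reward model. A minimal sketch, with made-up scores and a function name of my own invention:

```python
import math

def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    # The reward model is trained so the human-chosen response scores higher:
    # loss = -log(sigmoid(score_chosen - score_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(score_chosen - score_rejected))))

# Small loss when the model already agrees with the human's A/B pick...
print(pairwise_loss(2.0, 0.5))  # ~0.20
# ...large loss when it ranks the rejected response higher.
print(pairwise_loss(0.5, 2.0))  # ~1.70
```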

My understanding is that the RL used for o1 training does not include a human, but rather another model or models. In that case, wouldn't it be limited by the training of the model that is tasked with deciding whether an answer is high quality? Also, are reasoning models similar in intended purpose to multi-agent workflows? My questions may be basic, and I appreciate your responses. Thanks.

Yes, that's right. It's theorized (based on OpenAI's published research) that they have a model generate a Chain-of-Thought response to a STEM question, and then use a separate verifier model to evaluate each of the steps in the chain of thought. If the verifier believes any one of the steps is incorrect (even if the model ultimately got the correct answer), the model is penalized; otherwise, it is rewarded.
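
In pseudocode, that process-supervision scheme looks roughly like this (a sketch of the idea only; the verifier here is a trivial stand-in for what would really be a trained model):

```python
def verify_step(step: str) -> bool:
    # Stand-in for the separate verifier model; in practice this would
    # be a trained LLM scoring the step, not a string check.
    return "error" not in step.lower()

def process_reward(chain_of_thought: list[str]) -> float:
    # Penalize the whole chain if ANY step fails verification,
    # even if the final answer happens to be correct.
    return 1.0 if all(verify_step(step) for step in chain_of_thought) else -1.0
```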

We don't know the exact reason OpenAI is building reasoning models, but in their "5 levels of AGI," level 1 is chatbots, level 2 is reasoning models, level 3 is agents, level 4 is innovators, and level 5 is organizations (i.e., models working together to do the work of an entire organization). So presumably they believe you need reasoning models in order to get to level 3 and beyond.

It's just a placeholder bill for now, but Senator Wiener is definitely cooking something! https://www.courthousenews.com/california-state-senator-makes-second-play-for-guardrails-on-ai/

You are totally right! I knew this but forgot to mention it. I will update the post and credit you.

As someone pretty concerned about catastrophic risks from emerging technology, I've been pondering what could possibly be adequate action by the US government, and I've been pretty much coming up blank. Anything remotely adequate seems like a dystopian totalitarian surveillance state. So my thoughts have turned instead to decentralized governance options, with privacy-preserving mutual monitoring enabled by AI: I'll let your AI scan my computer for CBRN threats if you let my AI scan your computer... anything that doesn't meet the agreed-upon thresholds doesn't get reported.
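
A sketch of what I mean, with the obvious caveat that every name and threshold here is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class ScanResult:
    threat_score: float  # produced locally by the scanning AI

AGREED_THRESHOLD = 0.9  # negotiated in advance by both parties

def report(result: ScanResult) -> str | None:
    # Only findings above the agreed threshold ever leave the machine;
    # everything below it stays private by construction.
    return "THRESHOLD_EXCEEDED" if result.threat_score >= AGREED_THRESHOLD else None
```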

I think Allison Duettmann's recent writing on the subject brings up a lot of promising concepts in this space, although no cohesive solutions as of yet. https://www.lesswrong.com/s/9SJM9cdgapDybPksi