Topic: Open-R1: a Fully Open Reproduction of DeepSeek-R1
Hey there! This article is an intro to the project, not a claim that we've reproduced R1 yet. We're building in the open, so as soon as we have evaluation numbers, we'll share them. You can follow our progress on Hugging Face and GitHub.
True, but it looks like there's nothing to be evaluated as of today. I presume the ultimate goal is to train a new reasoning model and then use the same evaluation metrics as o1 and DeepSeek-R1.
Well, there should be at least some sanity checks and validation to make sure the model was trained correctly.
Oh yes, if you are talking about the evaluation numbers of DeepSeek's model, they're coming soon!
As mentioned in the article, there is no model called Open-R1 to evaluate at all... not yet, anyway. This is a blog post explaining that Hugging Face will take the DeepSeek R1 model, work out how it was built, as laid out in the paper and from what they released, and then reproduce that process.
In fact, this is pretty much how science works... A comes up with a plan, discovery or invention, and it is tested by B, C and D to see if it is reproducible. That's been the cornerstone of research for a couple of centuries now.
This blog post is not saying they have already done so... It's a post outlining an intent to start training a model like R1 and calling it Open-R1.
Also, DeepSeek-R1 was only released last week, and even in their paper they described the compute hours required. While those are low compute hours for a SOTA model, that does not mean you can train said model in a week. I'd personally love to be able to train a transformer model in a week, but we might need to wait a while for that level of compute technology.
So there are no benchmarks for a model that has not been built yet, right? As explained in the blog post, and again in reply to your question.
But fear not: there is a GitHub repo already, contributors (hell, I might join myself), some prelim work done, and a master plan. A good starting position.
@edbeeching has already evaluated the released models (src: https://x.com/edwardbeeching/status/188 … 136275742)
R1 just trained on o1 outputs, so collectively... /s. This is what the new AI czars are saying.
That's nice, and it's important to understand this remarkable hype that lacks technical comprehension and explanation. Science is about reproduction, and if they claim to be open, let them fulfill the open part.
Please do publish the training cost.
We will!
Hi @bojan2501, thanks! We will indeed be working hard to make sure this training recipe can work for small language models on consumer hardware, since not everyone has a cluster of H100s at home :) The tool we used for the images was Excalidraw! https://excalidraw.com
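To make that a bit more concrete: the RL stage of the R1 recipe is GRPO, and TRL already ships a GRPOTrainer, so a toy-scale run is one way to picture "consumer hardware". Everything below (model choice, dataset, reward function) is an illustrative placeholder, not the final Open-R1 setup:

```python
# Toy-scale GRPO sketch with TRL; assumes trl>=0.14. Placeholders throughout.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Toy reward: prefer completions near 100 characters. A real run would use
# verifiable rewards (e.g. checking the final math answer).
def reward_len(completions, **kwargs):
    return [-abs(100 - len(c)) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # any prompt dataset works

args = GRPOConfig(
    output_dir="qwen-grpo-toy",
    per_device_train_batch_size=4,   # keep memory low for a single consumer GPU
    gradient_accumulation_steps=4,
    num_generations=4,               # completions sampled per prompt
    max_completion_length=128,
)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small model that fits at home
    reward_funcs=reward_len,
    args=args,
    train_dataset=dataset,
)
trainer.train()
```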
The code for the models is inside the model repositories, e.g. for V3: https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/modeling_deepseek.py
Hello Team, I'm Ray Bernard, the author and developer of EQUATOR. My research team will be working on a paper focused on replicating specific parts of DeepSeek R1. Our aim is to recreate the cold start and provide your team with a dataset that includes CoT and other techniques to support these efforts. We'd love to contribute our work to help. Please let me know if you find this useful. Best, Ray Bernard https://www.facebook.com/groups/1186310571520299/
Where are the evaluation numbers? Without them you can't call it a reproduction.
That's quite interesting. I was asking myself why the concerns the author raised here are not being asked by others. I think the work they have done is remarkable, but at the same time I wonder why they wouldn't publish these missing pieces if they are supposed to be fully open.
And why, even without reproduction and understanding of how it was developed, could they impact the market so much in this way?
Hi! This article is an introduction to the project, not a claim that we've reproduced R1 yet. We will absolutely share the missing pieces when we have them; you can expect the models and datasets to be uploaded in this Hugging Face org and the code to be in this GitHub repo.
Interesting read, and it is good that we see more effort in this direction: more optimization and less brute force.
Also, I wonder what tool the author used to create the step diagram.
Excalidraw!
I'm so glad that initiatives like this already exist. I'm gonna try to contribute :)
Looking forward to it!
Such a racist article.
WTF are you talking about?
Must be a joke.
Awesome to have this open reproduction started!
For Step #1, check out https://github.com/open-thoughts/open-thoughts!
https://x.com/ryanmart3n/status/1884284101265612856
Let's do this thing!
It's really cool to see how the whole open source community comes together!
Does anybody know the actual training cost of R1? I can't find it in the paper or the announcement post. Is the 6M cost reported by the media just the number taken from V3's training cost?
Oops...
5.5M is the number reported in the DeepSeek-V3 tech report (just the training, not the experiments afaik); for R1 it's hard to estimate tbh, but much less than 5.5M imo.
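For context, the V3 tech report's own figures (~2.788M H800 GPU-hours, priced at an assumed $2 per GPU-hour rental rate) multiply out to the widely quoted total:

```python
# Back-of-the-envelope check using the DeepSeek-V3 tech report's GPU-hour figures.
pre_training, context_extension, post_training = 2.664e6, 0.119e6, 0.005e6
gpu_hours = pre_training + context_extension + post_training  # 2.788M H800 hours
print(f"${gpu_hours * 2:,.0f}")  # -> $5,576,000, the "5.5M"/"6M" seen in the press
```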
Has anybody asked the DeepSeek team to release their training data and code, or at least share them privately with an independent replication project like this one? Have they refused such a request?
A faithful replication depends on using the same dataset and hyperparameters. Otherwise, any major discrepancies from the published benchmarks would be hard to pin down: are they due to differences in training data, or to the replication method itself?
Historically, they have never released code or datasets of their LLM training, so I wouldn't expect this time to be different. If they did release it, that would be amazing, of course!
In the meantime, we have to make best-guess estimates and see if we can get there ourselves.
You present a great replication procedure for DeepSeek's reasoning training. I will try something similar to it.
This is really good info. Can we fine-tune it for a particular use case once the code is released?
Yes, of course!
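No promises on the final API, but as a rough sketch, standard LoRA fine-tuning via peft should be feasible on a single consumer GPU once checkpoints exist. The model below is just a small stand-in, not an Open-R1 release:

```python
# Rough sketch only: LoRA fine-tuning on a small stand-in model until
# Open-R1 checkpoints land. Data here is a placeholder.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # stand-in model
tok = AutoTokenizer.from_pretrained(model_name)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)
# LoRA keeps the trainable parameter count small enough for one consumer GPU.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

data = Dataset.from_dict({"text": ["<your domain-specific examples go here>"]})
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="r1-finetune", per_device_train_batch_size=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```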
Please consider removing biased, polluted or unaligned training data, and make an effort to remove copyrighted works from the crawl before consumption. This will make the model more usable. If you reused Anthropic's curation checks, this might also help; removing obviously biased data will likely add a lot of value. We don't want another tainted, unaligned open source model, right? And no business would ever use DeepSeek or a model that reuses it, right?
We appreciate your work for the benefit of humanity, we hope.
Miike C from NJ
So essentially you're asking to replace existing censorship with another flavour of censorship?
Can't wait! Hopefully the model will be uncensored, but whatever you can do is alright! Love seeing open source building itself up. I'm not smart enough to actually help, but I can contribute moral support lol.
Hello guys, I am just looking for the code for DeepSeek-V2, in order to fully understand multi-head latent attention. You don't seem to have code on Hugging Face even for that. Or am I missing something? I don't see anything in src/transformers/models. MLA is not properly explained in their paper, so it would be essential to have code for it.
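In the meantime, here is a minimal PyTorch sketch of the core MLA idea as I read it from the DeepSeek-V2 paper: cache one small shared latent instead of full per-head K/V. This is my own reading, not DeepSeek's code; the decoupled RoPE branch and causal masking are omitted for brevity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMLA(nn.Module):
    """Multi-head latent attention, reduced to its core: a low-rank KV cache."""
    def __init__(self, d_model=512, n_heads=8, d_latent=64):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compress hidden state to a small latent
        self.k_up = nn.Linear(d_latent, d_model)     # expand latent to per-head keys
        self.v_up = nn.Linear(d_latent, d_model)     # expand latent to per-head values
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, kv_cache=None):
        B, T, _ = x.shape
        latent = self.kv_down(x)                     # (B, T, d_latent): the only thing cached
        if kv_cache is not None:                     # append to previously cached latents
            latent = torch.cat([kv_cache, latent], dim=1)
        S = latent.shape[1]
        q = self.q_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = self.k_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)
        o = F.scaled_dot_product_attention(q, k, v)  # (B, n_heads, T, d_head)
        return self.out(o.transpose(1, 2).reshape(B, T, -1)), latent

x = torch.randn(2, 10, 512)
y, cache = SimpleMLA()(x)
print(y.shape, cache.shape)  # output (2, 10, 512), but a cache of only (2, 10, 64)
```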