Talk:DALL-E/GA1

GA Review

Reviewer: RoySmith (talk · contribs) 16:29, 2 April 2021 (UTC)

Starting review. My plan is to do two major passes through the article, the first for prose, the second to verify the references. In general, all my comments will be suggestions which you can accept or reject as you see fit. -- RoySmith (talk) 16:29, 2 April 2021 (UTC)

I put this on hold for a week, with no response, so closing this as a failed review. -- RoySmith (talk) 15:39, 10 April 2021 (UTC)

Shit! I have been busy with a bunch of online stuff happening at the same time. Anyway, I will go through all of these things and probably do some expansion, and nominate again later. jp×g 18:43, 11 April 2021 (UTC)

Prose

Lead section

"It uses a 12-billion parameter[2] version of the GPT-3 Transformer model to interpret natural language inputs (such as "a green leather purse shaped like a pentagon" or "an isometric view of a sad capybara") and generate corresponding images." For the lead section, I'd leave out the whole "(such as ...)" parenthetical phrase. That's covered in more detail in the main body.

? This one, I'm not sure how familiar a layman would be with the phrase natural language being used to specifically mean "a sentence spoken the way you'd say it to a person" rather than "written in a human language". After all, fetchString = cur.execute("SELECT * FROM threads2 WHERE replycount > 20 AND viewcount/replycount > 300 ORDER BY forumid, replycount DESC LIMIT 1000") is an English-language sentence. You are correct that this is an unreasonably long sentence, though. I will ponder this. jp×g 19:10, 2 April 2021 (UTC)
JPxG, You've linked to natural language. Somebody who is not familiar with AI jargon can click through to find out what it means, and the fact that it's linked should be a clue that it has some special meaning. But, maybe if you want to give a little more, replace the current "(such as ... capybara)" with something like "(conventional human language)"? I'm assuming this was trained on an English-language corpus; you should verify that and mention it somewhere if it is indeed the case. -- RoySmith (talk) 20:07, 2 April 2021 (UTC)

"It is able to create images" -> "It can create images"

Y Done. jp×g 19:07, 2 April 2021 (UTC)

The long lists of citations on some sentences (e.g. "...texture of a porcupine".[2][4][5][6][7][8]) seem like WP:REFBOMB and detract from readability (particularly in the lead section). Can these be trimmed to just the most important sources that actually support the statement?

Y This is a silly artifact of how I wrote the article (write a couple-sentence stub, locate and format all references, then flesh out an article by moving them down into the expanded text), which is definitely unintentional. Fixed. jp×g 19:07, 2 April 2021 (UTC)

"DALL-E's name is a" -> "The name is a"

Y Fixed. jp×g 19:13, 2 April 2021 (UTC)

"in conjunction with another model, CLIP" -> "in conjunction with CLIP"

Y Fixed. jp×g 19:13, 2 April 2021 (UTC)

"OpenAI has refused to release source code for either model" While this may be true, it's non-neutral. The implication is, "They should have released the source code, they were asked to do so, and they refused". I see you talk about this more later, but here in the lead, you state it in Wikipedia voice, which violates WP:NPOV. Must fix.

? This one is a little tricky: in GPT-2 (frankly, a better article) I wrote at much greater length about the issue there. To wit, OpenAI was founded as a nonprofit, and received funding to develop models, with the explicit goal of making their research open to the public (as opposed to organizations like DeepMind). The decision to not release GPT-2 was widely criticized, and they ended up releasing it anyway after determining that the abuse concerns were not based in fact. However, I agree that this should probably be explained in greater detail than just saying they "refused". jp×g 19:38, 2 April 2021 (UTC)

"one of OpenAI's objectives through DALL-E's development" -> "one of DALL-E's objectives"

Y Fixed. jp×g 19:40, 2 April 2021 (UTC)

Architecture

"model was first developed by OpenAI in 2018" -> I think "initially" works better than "first"

Y Done. jp×g 19:41, 2 April 2021 (UTC)

"was scaled up to produce GPT-2 in 2019.[10] In 2020, GPT-2 was augmented similarly to produce GPT-3,[11] of which DALL-E is a implementation.[2][12]" -> "was scaled up to produce GPT-2 in 2019, and GPT-3 (which DALL-E uses) in 2020."

I have made an attempt here, but it is still a little awkward. Let me know what you think. jp×g 19:43, 2 April 2021 (UTC)
JPxG, What you've got now is better. I could suggest some other alternatives, but I think it's fine now. -- RoySmith (talk) 20:10, 2 April 2021 (UTC)

"It uses zero-shot learning", clarify whether "it" refers to GPT-3, DALL-E, or GPT in general.

Y I've moved that sentence down to an appropriate location, where I think it is much clearer (and where it makes more sense to be anyway). jp×g 19:47, 2 April 2021 (UTC)

"scaled down from GPT-3's parameter size of 175 billion" -> "scaled down from GPT-3's 175 billion"

Y jp×g 19:47, 2 April 2021 (UTC)

"large amounts of images" -> "a large number of images".

? I will dig my heels in very slightly on this one; I say amounts (plural) since it generates a large amount (singular) in response to each prompt (singular). jp×g 19:47, 2 April 2021 (UTC)
JPxG, The problem is, "amount" implies a measurement, as opposed to a count. You can have "a large amount of image data", but you can't have "a large amount of images". You can have "a large number of images", or "many images", or "a voluminous quantity of images", or "a boatload of images". -- RoySmith (talk) 20:15, 2 April 2021 (UTC)
Much to think about. I think you are correct; will fix. jp×g 20:43, 2 April 2021 (UTC)

"another OpenAI model, CLIP,", this should start a new sentence.

Y jp×g 19:47, 2 April 2021 (UTC)

""understand and rank" its output", I think "its" refers to DALL-E's here, but clarify.

Y jp×g 19:48, 2 April 2021 (UTC)

" (like ImageNet)", I'd leave that out completely. Since you're talking about "most classifier models", calling out one in particular doesn't add any value.

ImageNet is a curated dataset of labeled images, not a classifier model. I have edited it to make this a little clearer. jp×g 19:49, 2 April 2021 (UTC)
JPxG, But there are still lots of curated image datasets. What's so special about ImageNet that it needs to be called out as the one example you mention as something that wasn't used? -- RoySmith (talk) 20:18, 2 April 2021 (UTC)

"Rather than learn from a single label", avoid repetition of the word "rather".

Y Rather than use something instead of that rather, I have used instead rather than rather for the previous rather. jp×g 19:51, 2 April 2021 (UTC)

"CLIP learns to associate" -> "CLIP associates"

Y jp×g 19:51, 2 April 2021 (UTC)

Performance

As above, WP:REFBOMB

Y Fixed. jp×g 19:39, 2 April 2021 (UTC)

"quoted Neil Lawrence ... describing it as ..." I think you mean, "quoted Neil Lawrence ... who described it as ..."

Y jp×g 19:53, 2 April 2021 (UTC)

" He also quoted Mark Riedl" clarify who "he" is.

Y jp×g 19:53, 2 April 2021 (UTC)

Implications

"DALL-E demonstrated "it is becoming"", not sure, but maybe "DALL-E demonstrated that "it is becoming""

Y jp×g 19:54, 2 April 2021 (UTC)

My overall impression is that this reads like a publicity piece for OpenAI. The vast majority of the quotes are extolling the virtues of the system, with only one or two examples of problems, and even those are in the context of, "but here's an example of what it does better". The REFBOMB aspect is part of the problem, but it's deeper than that. I'm going to put the rest of the review on hold for a week to give you a chance to address that. -- RoySmith (talk) 17:54, 2 April 2021 (UTC)

Thanks for taking the time to review! I will go through it now. jp×g 19:01, 2 April 2021 (UTC)

Okay, I have gone through it. I think that the lack of negativity in the article is mostly a consequence of OpenAI's embargo; nobody can access the code outside of an extremely narrowly controlled demo which is more like a photo album than an interface (which, nevertheless, I strongly advise you to check out to form an opinion on the model). They also took a fair bit of time to release the paper, which I will admit to not having had time to go through yet, and I think this would also enable a fairly neutral description of what it does. Some of the more cynically minded opinion-havers called the GPT-2 embargo a deliberate strategy to hype up that model's capabilities. I will go hunting for some more stuff to add to the article, though. jp×g 20:00, 2 April 2021 (UTC)

Images

I took another look at this. The infobox image File:DALL-E sample.png claims to be MIT-licensed from https://openai.com/blog/dall-e/. I can't find the image there. The Commons page is also lacking author information.