User talk:Kjerish


Image

Hi Kjerish. I noticed the image you added. I think an image explaining the training steps of GPT models can indeed be very useful. The diagram looks accurate. One suggestion, however: I propose including the image in the articles ChatGPT and Generative pre-trained transformer instead of Fine-tuning (deep learning) and Reasoning language model, because fine-tuning is broad and includes many techniques, and the image does not illustrate the core innovation of reasoning language models. Also, I'm not sure whether it would be an improvement or not, and it's ok if you don't want to change it, but have you considered making the image vertical? (The horizontal format may not be ideal for readability on phone screens.) Alenoach (talk) 18:36, 27 July 2025 (UTC)

@Alenoach: Hi Alenoach, thanks for reaching out. I took inspiration from the diagram made by OpenAI for InstructGPT, released ~10 months before ChatGPT. I could see the argument you're making in terms of scope. I tried to qualify it in a way that was clear that it's referring to a specific thing but I think in the case of the Reasoning language model article I should just make a new diagram. I'll give the vertical layout a shot too. Once I do that I'll add it to the ChatGPT and Generative pre-trained transformer articles. I am not overly attached to the way that it is, just thought that it should exist in some form. I uploaded the TikZ code in case anyone wanted to change it but I have some time right now to work on it – Kjerish (talk) 21:10, 27 July 2025 (UTC)

Thanks! For the vertical layout, might be worth getting some more feedback from other people, I'm not really confident that vertical is better. Perhaps adjusting the font or font size could be another way to make it easier to read. I made an edit to the article on LLMs btw. Alenoach (talk) 21:43, 27 July 2025 (UTC)

@Alenoach: What do you think of it now? I added 4 changes: vertical layout, examples, order of magnitude, legend. Might be too much at once but the other 3 aspects are probably things that would have been added over time anyway – Kjerish (talk) 23:17, 27 July 2025 (UTC)

I suggest undoing the upload and uploading the new (vertical) version to a new Wikimedia Commons page, in case someone wants to use the old version (also because the current image parameters on Wikipedia would need to be adjusted for the new image not to be oversized). Alenoach (talk) 23:42, 27 July 2025 (UTC)

I undid the upload as a temporary solution to avoid oversized images on Wikipedia articles. I hope that's ok with you. Alenoach (talk) 00:20, 28 July 2025 (UTC)

For the blue boxes, the content seems well-researched, but I worry that these orders of magnitude could become outdated or may not be accurate for some models. A few aspects could also be adjusted to be a little easier to understand for people who are new to the topic; for example, "KL" may be confusing to many readers. Otherwise, the design is clean and the yellow explanations are insightful. Alenoach (talk) 23:56, 27 July 2025 (UTC)

@Alenoach: Updated – Kjerish (talk) 03:03, 28 July 2025 (UTC)

It looks good!

A few minor suggestions: "The model acquires grammar, facts, and coding patterns from raw text." may be simplified to "The model learns grammar, facts and coding from raw text." Also, for the reinforcement learning step of stage 3, people may not understand that the goal is to train a reward model to predict human preferences, and then train the GPT model to satisfy this reward model, thus assimilating human preferences.

Also, for your information, training the reward model doesn't have to involve the model that is being trained; in principle you could reuse a reward model that was trained using the outputs of a completely different LLM (a benefit of RLHF is that once the reward model is trained, you don't need humans to label data anymore). Also, the SFT and RL steps can be mixed (afaik companies make multiple iterations of SFT and RL). The graph still looks fine though; those are just technical details.

On phone, it looks better, but it's still a little difficult for me to read; something like slightly increasing the font size could make it more comfortable to read without the need for people to zoom. I'll let you decide. Alenoach (talk) 04:49, 28 July 2025 (UTC)

@Alenoach: Updated caption text and made it larger. I kept the arrows in place but mentioned that the base model for the reward model might have a different origin.

I saw a cool video about reward modeling recently. We've gone from doing P(x,y) -> Elo to P(x,y,y') -> Probability y'>y (which is obvious).

But in the future, there will be an input that describes the human grading it so that we can have a single reward model that can guess how different types of people will grade a given prompt. Then we could reduce bias by having benchmarks for how different demographics of people will grade particular answers. That and maybe each user will have their own fine-tuned LLM. Pretty creepy – Kjerish (talk) 06:15, 28 July 2025 (UTC)
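
As a side note on the pairwise formulation mentioned above (P(x,y,y') -> probability that y' is preferred over y): one common way to model it is a Bradley-Terry style logistic function of the difference between two scalar reward scores. The sketch below is only an illustration under that assumption; the function names and the scalar scores are hypothetical and do not come from the video or from any particular library.

```python
import math


def preference_probability(reward_y: float, reward_y_prime: float) -> float:
    """Probability that y' is preferred over y, given scalar rewards.

    Assumption: a Bradley-Terry style model, i.e. a logistic function
    of the reward difference r(x, y') - r(x, y).
    """
    return 1.0 / (1.0 + math.exp(-(reward_y_prime - reward_y)))


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-chosen answer wins.

    Minimizing this pushes the reward of the chosen answer above
    the reward of the rejected one.
    """
    return -math.log(preference_probability(reward_rejected, reward_chosen))


# Hypothetical scores for two answers to the same prompt.
p = preference_probability(reward_y=0.2, reward_y_prime=1.5)
loss = pairwise_loss(reward_chosen=1.5, reward_rejected=0.2)
```

With equal rewards the probability is 0.5, and it approaches 1 as the reward of y' grows relative to y.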

That's an interesting development; hard to tell whether it will turn out to be socially good or detrimental for chatbots to be individually customized.

I propose to now move the image from the articles Fine-tuning (deep learning) and Reasoning language model to ChatGPT and Generative pre-trained transformer.

These articles get a lot of views by the way, so that should be really useful. Thanks! Alenoach (talk) 04:00, 29 July 2025 (UTC)

@Alenoach: Done. The ChatGPT one is a great idea. Hopefully more people will see it and improve upon it. Thank you for your suggestions! – Kjerish (talk) 04:20, 29 July 2025 (UTC)

Nomination for discussion of Template:Number of executive orders signed by Donald Trump

Template:Number of executive orders signed by Donald Trump has been nominated for discussion. You are invited to comment on the discussion at the entry on the Templates for discussion page. WikiCleanerMan (talk) 21:53, 10 September 2025 (UTC)

ArbCom 2025 Elections voter message

Hello! Voting in the 2025 Arbitration Committee elections is now open until 23:59 (UTC) on Monday, 1 December 2025. All eligible users are allowed to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2025 election, please review the candidates and submit your choices on the voting page. If you no longer wish to receive these messages, you may add {{NoACEMM}} to your user talk page. MediaWiki message delivery (talk) 00:38, 18 November 2025 (UTC)

Nomination of Embassy of the United States, Tashkent for deletion

A discussion is taking place as to whether the article Embassy of the United States, Tashkent is suitable for inclusion in Wikipedia according to Wikipedia's policies and guidelines or whether it should be deleted.

The article will be discussed at Wikipedia:Articles for deletion/Embassy of the United States, Tashkent until a consensus is reached, and anyone, including you, is welcome to contribute to the discussion. The nomination will explain the policies and guidelines which are of concern. The discussion focuses on high-quality evidence and our policies and guidelines.

Users may edit the article during the discussion, including to improve the article to address concerns raised in the discussion. However, do not remove the article-for-deletion notice from the top of the article until the discussion has finished.

AusLondonder (talk) 11:35, 27 January 2026 (UTC)

Proposed deletion of Embassy of the United States, Bangui

The article Embassy of the United States, Bangui has been proposed for deletion because of the following concern:

Lacking "significant coverage in multiple reliable secondary sources that are independent of the subject" to meet WP:ORGCRIT. Virtually no secondary sources. Most of the content is not specifically about the embassy but about diplomatic relations in general.

You may prevent the proposed deletion by removing the {{proposed deletion/dated}} notice, but please explain why in your edit summary or on the article's talk page.

Please consider improving the page to address the issues raised. Removing {{proposed deletion/dated}} will stop the proposed deletion process, but other deletion processes exist. In particular, articles for deletion allows discussion to reach consensus for deletion based on established criteria.

If the proposed deletion has already been carried out, you may request undeletion of the article at any time. AusLondonder (talk) 11:44, 27 January 2026 (UTC)

Proposed deletion of Embassy of the United States, Astana

The article Embassy of the United States, Astana has been proposed for deletion because of the following concern:

Unnecessary content fork of Kazakhstan–United States relations. Fails WP:ORGCRIT as lacking "significant coverage in multiple reliable secondary sources that are independent of the subject."

You may prevent the proposed deletion by removing the {{proposed deletion/dated}} notice, but please explain why in your edit summary or on the article's talk page.

If the proposed deletion has already been carried out, you may request undeletion of the article at any time. AusLondonder (talk) 11:48, 27 January 2026 (UTC)

"Camicia" listed at Redirects for discussion

The redirect Camicia has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Anyone, including you, is welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2026 February 6 § Camicia until a consensus is reached. consarn (talck) (contirbuton s) 14:05, 6 February 2026 (UTC)
