Wikipedia:Reference desk/Archives/Computing/2022 December 5
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.
December 5
A 64-bit number can hold up to 1.8×10^19 different values. I don't understand why that huge number translates to only around 450 gigabytes on my hard drive. Shouldn't my disk drive be able to hold many more bytes? I'm not too familiar with computer hardware, so this may sound like a dumb question. 2600:6C44:117F:95BE:FCB3:FE0F:795:3BC9 (talk) 22:18, 5 December 2022 (UTC)
My second question is: how come a 64-bit computer only has 8 GB of RAM? That seems way too low compared to the big number above. 2600:6C44:117F:95BE:FCB3:FE0F:795:3BC9 (talk) 23:14, 5 December 2022 (UTC)
- The "64 bits" in this case means that the computer is capable of efficiently working with 64-bit values.
- There are a couple of key words in there:
- "Capable" - the computer can handle 64 bits at a time - but that doesn't mean it always does handle 64-bits at a time!
- "Efficient" - most computers can handle any number of bits. But, a "64-bit computer" can do this efficiently (compared to, say, a 32-bit system).
- This is all a bit wishy-washy: the reality is that the details are really quite complicated. Modern computer architectures sometimes use different bit-sizes for different parts of the system; some of the math hardware might be 64-bits wide; some of the memory interfaces might be 128-bits wide; some of the peripheral data might even be going over wires that are essentially 8-bit or even one-bit wide!
- One of my favorite computer books is Patt and Patel's Introduction to Computing Systems. If you're at the level of knowing how to do binary math, but not really knowing how that math affects the capabilities of a real-world computer, this book is a great way to connect the concepts together, and really get into the nitty-gritty of what every part of the phrase "64-bit computer" means, one word at a time.
- Nimur (talk) 01:20, 6 December 2022 (UTC)
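For readers who want to see the first point for themselves, here is a minimal Python sketch (assuming a standard CPython build) that reports the word size of the interpreter it runs under; it just makes "64-bit" observable:

```python
import struct
import sys

# Size of a native pointer in the running Python build:
# 8 bytes (64 bits) on a 64-bit interpreter, 4 bytes on a 32-bit one.
print(struct.calcsize("P") * 8, "bits")

# sys.maxsize is the largest native index value; on a 64-bit build
# it is 2**63 - 1, another reflection of the machine word size.
print(sys.maxsize == 2**63 - 1)
```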
- Each single word in a 64-bit wide memory is one of 18,446,744,073,709,551,616 possible combinations. This does not imply that the memory has space to store all 18,446,744,073,709,551,616 different possible words at the same time. There are in fact ultra fast memories that have space for precisely one 64-bit word; these are usually called registers. --Lambiam 03:47, 6 December 2022 (UTC)
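To put numbers on that distinction between "how many values one 64-bit word can take" and "how many words a drive can hold", here is a quick back-of-the-envelope calculation in Python (the 450 GB figure comes from the original question):

```python
# Number of distinct values a single 64-bit word can take.
n_values = 2**64
print(f"{n_values:,}")              # 18,446,744,073,709,551,616

# A 450 GB drive stores about 450 * 10**9 bytes, i.e. roughly
# 56 billion 64-bit words -- a completely different quantity.
drive_bytes = 450 * 10**9
print(f"{drive_bytes // 8:,} words of storage")
```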
- To help make sense of that, is it correct to state that a "word" is the size of the numbers that the CPU is designed to use? Incoming numbers are a word. Outgoing numbers are a word. The math functions are based on words. Basically, the 64 bits refers more to the CPU than to the computer's hard drive or mouse or printer. 12.116.29.106 (talk) 14:36, 6 December 2022 (UTC)
- True, but one should hope that on a 64-bit computer the data bus is also 64 bits wide, which means the onboard memory can serve up a 64-bit word as one chunk; otherwise little is gained. So this involves the whole computer architecture. The term "number" is too restrictive; the data can also represent text. --Lambiam 10:57, 7 December 2022 (UTC)
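A small Python sketch of that last point, that the same 64 bits can represent a number or eight bytes of text, depending only on how you choose to read them:

```python
# Eight ASCII characters fit exactly into one 64-bit word.
word = int.from_bytes(b"computer", "little")
print(hex(word))                    # the 64 bits viewed as a number
print(word.to_bytes(8, "little"))   # the same bits viewed as text: b'computer'
```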
- So, I think we need to back up a step and get our terminology right. A bit is the smallest possible piece of information, a "binary digit", which is usually represented by a single "switch" that can be either "on" or "off" (1 or 0). That "switch" takes MANY different forms depending on whether it is a bit on a hard drive, in RAM, being transmitted over a communications network, etc. Saying "my computer is X bits" is basically as useless as saying "my car is X liters". What about the car is X liters? The total volume of the passenger compartment? The size of the fuel tank? The compression volume of the cylinders? There are lots of places in a car where liters is a measurement of something, and without context, we don't know what it means. Likewise, when you say something in your computer is "X bits" (or X kilobytes, or gigabytes, etc.), unless you know what that is measuring, it doesn't mean anything. In this case, 64-bit computing usually means that the standard-sized chunk of data that can be processed is 64 bits long; in oversimplified terms, it means your processor can do math with 64-bit numbers in a single operation. This has nothing to do with data storage or communication speed or memory or anything else that may also be measured in bits or bytes. --Jayron32 11:32, 7 December 2022 (UTC)
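As an oversimplified model of that last idea: Python's own integers are arbitrary-precision, so the fixed-width behaviour of a 64-bit processor has to be simulated with a mask, but the sketch below shows what "do math with 64-bit numbers in a single operation" amounts to:

```python
# A 64-bit register can only keep the low 64 bits of a result;
# the mask below mimics that truncation.
MASK = 2**64 - 1

def add64(a, b):
    """Add the way 64-bit hardware does: any carry out of bit 63 is lost."""
    return (a + b) & MASK

print(add64(2**64 - 1, 1))   # 0 -- wraps around on overflow
print(add64(40, 2))          # 42 -- small values fit in a word too
```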
Question: Literature about improvements to large neural networks that try to do more with the same number of parameters
I'm not too deep in ML, but I read articles every now and then (especially about hyped models, GPT and co). I see that there is progress on some amazing things (like GPT-3.5), partly because the underlying neural networks get bigger and bigger.
My question is: are there studies that check whether a NN could do more (be more precise, or whatever) given the same number of parameters? In other words, is it a race to make NNs as large as possible (given that they are structured appropriately), or is the "utility" per parameter also growing? I would like to know if there is literature about this. It is a bit like an optimization question: "do more with the same HW" (or parameters), so to speak. Pier4r (talk) 23:32, 5 December 2022 (UTC)
- Many of the recent advances stem from using different architectures ("topologies"), such as recurrent neural networks and transformers instead of unidirectional topologies, and from better learning algorithms such as Q-learning, resulting in faster convergence for a wider class of contexts. Just adding more layers can actually degrade performance. --Lambiam 10:33, 7 December 2022 (UTC)
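To make the "utility per parameter" framing concrete: the parameter count itself is easy to pin down, and the open empirical question in the literature is how capability grows with it. A minimal Python sketch of the count for a fully connected network (the layer sizes are invented purely for illustration):

```python
# Each fully connected layer contributes (inputs * outputs) weights
# plus one bias per output neuron.
def mlp_param_count(layer_sizes):
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(mlp_param_count([784, 256, 10]))        # 203,530 parameters
print(mlp_param_count([784, 256, 256, 10]))   # 269,322 -- one extra layer
```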