
Wikipedia:Reference desk/Archives/Computing/2017 December 14



December 14

Computerized ordering of Chinese characters

At work yesterday I sorted a very long Excel sheet of book titles from numerous languages, and as expected, all the Chinese characters showed up together at the end. As far as I could tell without examining it closely, titles with similar characters were adjacent to each other; in particular, I marked duplicates (Conditional Formatting), and they always appeared next to each other, even though they'd not been adjacent before the sort.

How do Excel and other major English-language-developed programs sort Chinese characters? Is it typical to represent them with ASCII characters (e.g. sorting them by their binary representation), or do some such programs have a sense of stroke order? I'm talking about general-use programs, rather than software that is primarily linguistic. Nyttend (talk) 13:29, 14 December 2017 (UTC)[reply]

There are two main ways.
  • Sorting by byte order, on the encoded form of the characters. This typically works quite well within a character set, but it is case-sensitive and works badly across character sets or mixtures of them (such as accented é relative to e). It is technically easy though, robust and repeatable for the full Unicode range, even in legacy systems that are based on 8-bit, ASCII-derived encodings.
If you can achieve nothing else with a software project, make it UTF-8-clean end-to-end, so that at least it doesn't break anything.
  • Collation is an important feature for any database these days, or other systems that take any sort of serious effort over text comparisons. The sophistication of the collation varies (I don't know Excel's offhand) but it begins with such simple processes as case-insensitivity, é vs e and the likely issues for European languages. Going outside Europe (or even into Cyrillic Eastern Europe) depends on vendors. Programming language support does not typically extend this far - you're looking at higher level services, like databases.[1] Going beyond Europe into CJKV still requires care: a recent platform offering full Unicode support might work, but there are still plenty of major packages which will break here.
  • As to radical-and-stroke sorting, it's hard to even have that conversation within the software industry. Finding support for it is rare; finding a large system where such things work correctly end-to-end (where there might be products from two dozen vendors involved) is almost unheard of for software developed in Europe. Most annoyingly, it tends to stop working in European software that successfully implements it but then goes into a 'maintenance phase' (not an Agile concept) where management no longer understand or support proper i18n. It's a bit better on the US west coast, but really the main principle is always that, "If we don't use it ourselves, the ability is soon lost." Post-Brexit, some large UK suppliers are already claiming 'advantages' in that this expensive internationalisation will no longer need to be supported. Andy Dingley (talk) 16:57, 14 December 2017 (UTC)[reply]
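A minimal Python sketch of the two approaches described above, for illustration only. The word list and the `fold_key` helper are assumptions made for this example. `fold_key` is a crude stand-in for collation (it only folds case and strips combining accents) and says nothing about how Excel or any particular database collates.

```python
import unicodedata

# "\u00e9clair" is "éclair" written with a precomposed é.
words = ["zebra", "Apple", "\u00e9clair", "eagle", "中文"]

# 1. Byte-order sort: compare the UTF-8 encoded bytes.
by_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# Python's default string comparison is by code point, which gives
# the same order as comparing UTF-8 bytes.
by_code_point = sorted(words)


def fold_key(s):
    """Hypothetical, very rough collation key: ignore case and accents."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# 2. Collation-like sort: é is treated like e, A like a.
by_folded = sorted(words, key=fold_key)

print(by_bytes)   # éclair lands after zebra
print(by_folded)  # éclair lands between eagle and zebra
```

In both cases the Chinese title ends up last, which matches what the original question observed, but only the second ordering places é next to e.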
Excuse my ignorance, but is stroke order not simply (part of) a recipe for forming the glyph corresponding to a given character? I would assume that an ordering of words would be based only on the resulting character, not on the way it was produced. As such, I would assume a fixed ordering of characters, and by lexicographic extension, of words. Whether UTF captures that particular order is a different question, but that would only need a simple transposition table. Am I missing something? --Stephan Schulz (talk) 17:44, 14 December 2017 (UTC)[reply]
Radical-and-stroke sorting has a long tradition within Chinese and Japanese calligraphy, and the Four-Corner Method has another fairly established history. Children learn them this way, and character dictionaries are ordered this way. It's more than a simple way of ordering glyphs, such as counting descenders in Latin characters would be (possible, but with no established practice behind it).
Technically it's not even that hard to do it. What's needed is interest and commitment from the software vendors. Andy Dingley (talk) 18:02, 14 December 2017 (UTC)[reply]
Ok, I looked up Radical-and-stroke sorting - quite interesting, thanks. But as I understand it, it again induces a way to order the characters - very useful for humans, who then don't need to remember an arbitrary ordering of a large set of characters. But for a computer, remembering an arbitrary order of some few thousand characters is not really a problem nowadays. Of course, I assume the set of characters is fixed and finite - maybe that is my mistake? --Stephan Schulz (talk) 17:22, 15 December 2017 (UTC)[reply]
Thanks for the pointer to radical-and-stroke sorting; I'd conflated the two concepts and thought that the way you write a character and the way it's organised in the dictionary were related. Nyttend (talk) 01:35, 16 December 2017 (UTC)[reply]
  • These are interesting too:
  • Chinese typewriter anticipated predictive text, finds Stanford historian, Stanford Report, November 28, 2012
  • Thomas S. Mullaney (October 2012). "The Moveable Typewriter: How Chinese Typists Developed Predictive Text during the Height of Maoism". Technology and Culture. 53 (4): 777–814. doi:10.1353/tech.2012.0132.
Andy Dingley (talk) 12:19, 17 December 2017 (UTC)[reply]
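The point made in this thread, that a computer only needs a lookup table to apply radical-and-stroke order, can be sketched in Python as follows. The table is a tiny hand-entered sample (Kangxi radical number, then strokes beyond the radical) and the function name is invented for this example; a real program would presumably load a complete table, for instance from the Unihan database's kRSUnicode field, rather than typing it in.

```python
# Illustrative sample only: character -> (Kangxi radical number,
# additional strokes beyond the radical).
RADICAL_STROKE = {
    "一": (1, 0),
    "三": (1, 2),
    "人": (9, 0),
    "他": (9, 3),
    "口": (30, 0),
    "木": (75, 0),
    "林": (75, 4),
    "水": (85, 0),
    "河": (85, 5),
}


def radical_stroke_key(text):
    """Sort key: per character (radical, extra strokes, code point).

    The code point is only a tie-break. Characters missing from the
    table sort after all known ones.
    """
    unknown = (float("inf"), 0)
    return [RADICAL_STROKE.get(ch, unknown) + (ord(ch),) for ch in text]


titles = ["河水", "林", "他人", "一", "木", "人口", "水"]
print(sorted(titles, key=radical_stroke_key))
```

Because the key is a list of per-character tuples, multi-character titles are compared character by character, which is the lexicographic extension mentioned above.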

Using undefined variables

If a computer program is confronted with an unknown variable, couldn't it just wait-and-see instead of raising an error?

For example, a=b+b followed by b=8 would raise a NameError in Python. But b=8 followed by a=b+b is OK. Couldn't Python just remember that the value of a is 2b and wait until it is needed as a numerical value and, if it does not get the value, then raise the error?--Hofhof (talk) 18:15, 14 December 2017 (UTC)[reply]

In procedural languages like Python the equals sign '=' means something quite different from what it means in mathematics. It means assign the value of the right hand side to the variable on the left. The value of the variable can be changed to something quite different a few lines later. See Procedural programming. Dmcq (talk) 18:22, 14 December 2017 (UTC)[reply]
(ec) Well, it could, but it would be a rather strange thing to do, and might go against the principle of least astonishment. The statement a = b + b in Python (and its equivalent in other procedural languages) means 'calculate the current value of b + b and assign that to the variable a', not 'define a to be whatever the value of b + b is when a is referenced'. Under the second interpretation, the statements b = 8, a = b + b, b = 16, print (a) might be expected to output the value 32. AndrewWTaylor (talk) 18:25, 14 December 2017 (UTC)[reply]
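A short Python sketch of the behaviour described above. The function names are made up for this example; the point is only that the right-hand side is evaluated at the moment of assignment, and that an unbound name fails at that same moment.

```python
def eager_demo():
    b = 8
    a = b + b      # evaluated right now: a is bound to the number 16
    b = 16         # rebinding b later does not touch a
    return a


def undefined_demo():
    try:
        a = c + c  # c has never been assigned anywhere
    except NameError as err:
        return type(err).__name__
    return "no error"


print(eager_demo())      # 16, not 32
print(undefined_demo())  # NameError
```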
Some computer languages do support this; it's called Lazy evaluation.-gadfium 18:54, 14 December 2017 (UTC)[reply]
Lazy evaluation applies more to functions, which may not be computed until the results are required. A computer algebra system may be more useful to you. Mathematica is an example of a language/system which can support statements similar to your example.-gadfium 19:03, 14 December 2017 (UTC)[reply]
Declarative programming languages do this sort of thing. In these b = 8, a = b + b, b = 16 would be wrong because b is declared twice, but a = b + b, b = 16 is okay and sets a to 32. Dmcq (talk) 23:22, 14 December 2017 (UTC)[reply]
That would be unnecessarily complicated. It is best if it knows what b is when it reaches a=b+b. Bubba73 You talkin' to me? 22:33, 14 December 2017 (UTC)[reply]
  • As noted, declarative programming (and its sometimes massive overheads) is there for this. It's not normally needed, so why support a costly feature that's not needed? In this case, a simple function definition or even a lambda would do the job far more simply (and more pythonically).
More importantly, why would you write code this way? Chances are that you didn't mean to, it was an accidental typo. In which case 'failing early' during compilation is by far the better strategy, allowing the mistake to be corrected. Maybe a is required to be 16 and a later assignment of b = 7 (and implicitly then setting a to 14 instead) would rain fiery death down upon the metropolis. Andy Dingley (talk) 23:47, 14 December 2017 (UTC)[reply]
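The lambda approach mentioned above might look like the following sketch. The wrapper function `deferred_demo` exists only to keep the example self-contained; the values 8 and 7 echo the ones used in this thread.

```python
def deferred_demo():
    # The lambda body is not evaluated until a() is called,
    # so b does not need to exist yet.
    a = lambda: b + b

    try:
        a()                      # too early: b is still unbound
        early = "ok"
    except NameError:
        early = "NameError"

    b = 8
    first = a()                  # 16
    b = 7
    second = a()                 # 14: a() always uses the current b
    return early, first, second


print(deferred_demo())
```

The error is still raised if the value is requested before b exists, which is the wait-and-see behaviour the question asked about, but here the deferral is explicit in the code rather than silent.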
While Python doesn't work this way, I can imagine a language that does, allowing for "spreadsheet-like" programming. A language like Prolog also finds out by itself in which order to do the evaluation.
 //Define formulas
 P=I*V; 
 R=V/I;
 //Fill the knowns
 AskAndFill("Do you know Voltage",V) 
 //etc for Power,Resistance, Current
 If R.HasValue Print("Resistance (Ohms) is " + R)

Joepnl (talk) 00:03, 15 December 2017 (UTC)[reply]
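A rough Python approximation of the spreadsheet-like idea above, as a sketch only. The `Sheet` class and its method names are invented for this example. It stores formulas without evaluating them and computes a value only when asked; it does not solve equations (it cannot derive I from P and V) and it does not guard against circular formulas.

```python
class Sheet:
    """Toy 'spreadsheet': formulas are stored, not evaluated,
    until a value is asked for."""

    def __init__(self):
        self.values = {}     # known inputs, e.g. {"V": 12.0}
        self.formulas = {}   # name -> (input names, function)

    def define(self, name, inputs, func):
        self.formulas[name] = (inputs, func)

    def set(self, name, value):
        self.values[name] = value

    def get(self, name):
        """Return the value of name, or None if it cannot be computed yet."""
        if name in self.values:
            return self.values[name]
        if name not in self.formulas:
            return None
        inputs, func = self.formulas[name]
        args = [self.get(i) for i in inputs]
        if any(arg is None for arg in args):
            return None
        return func(*args)


sheet = Sheet()
# Define formulas first, before any values are known.
sheet.define("P", ["I", "V"], lambda i, v: i * v)
sheet.define("R", ["V", "I"], lambda v, i: v / i)

print(sheet.get("R"))   # None: V and I are not known yet
sheet.set("V", 12.0)
sheet.set("I", 2.0)
print(sheet.get("R"))   # 6.0
print(sheet.get("P"))   # 24.0
```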

Yea, well at "a=b+b", it doesn't even know what type b is. You might have to keep b undefined (and every one similar) for a very long time. And then a depends on b, and c may depend on a, and d may depend on c, etc. Bubba73 You talkin' to me? 02:52, 15 December 2017 (UTC)[reply]
And what about "a=b+c", with neither defined? What if later b is set to a string and c is set to a number? Bubba73 You talkin' to me? 23:46, 15 December 2017 (UTC)[reply]
That's probably a good time to raise an exception, or less subtle, decide to treat c as a string in "a=b+c". I don't understand why remembering "b is referenced by equation X" without knowing the future type of b would be a problem? If you want to be absolutely sure at compile-time, replace R=V/I by R(double)=V(double)/I(double) which says that it must be impossible to assign a string to V. Joepnl (talk) 00:46, 16 December 2017 (UTC)[reply]
It is a lot of unnecessary work that could be caught at compile time. I think it is also bad design. Bubba73 You talkin' to me? 00:59, 16 December 2017 (UTC)[reply]
Unless you have 40 formulas and don't know in advance in which order they should be executed. CanStartCar=SittingInCar and HasKeys. CarNeedsStarting=NeedToDrive. NeedToDrive=NeedToBeAtShop and not CurrentLocation.IsShop, etc. It is a mixup of declarative and imperative languages, which I don't like actually, but I can see that it could be useful. Joepnl (talk) 01:18, 16 December 2017 (UTC)[reply]
I once wrote a little command language for running SPICE scripts where one could write values like 3V or 30pF for instance and it would do dimensional checks in calculations. Dmcq (talk) 10:54, 17 December 2017 (UTC)[reply]
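The case raised above, many formulas whose evaluation order is not known in advance, can be handled by sorting the formulas by dependency before evaluating them. Below is a sketch using `graphlib.TopologicalSorter` from the Python standard library (3.9 or later). The formulas are the car examples from this thread, with `CurrentLocation.IsShop` renamed to `IsAtShop`; the fact values and the `evaluate` helper are made up for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# name -> (names it depends on, function of those values)
formulas = {
    "CanStartCar": (["SittingInCar", "HasKeys"], lambda s, k: s and k),
    "CarNeedsStarting": (["NeedToDrive"], lambda n: n),
    "NeedToDrive": (["NeedToBeAtShop", "IsAtShop"],
                    lambda need, at: need and not at),
}

# Known facts; the values are invented for this example.
facts = {
    "SittingInCar": True,
    "HasKeys": True,
    "NeedToBeAtShop": True,
    "IsAtShop": False,
}


def evaluate(formulas, facts):
    """Evaluate every formula once, in dependency order."""
    graph = {name: set(deps) for name, (deps, _) in formulas.items()}
    values = dict(facts)
    # static_order() yields each name after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        if name in formulas:
            deps, func = formulas[name]
            values[name] = func(*(values[d] for d in deps))
    return values


result = evaluate(formulas, facts)
print(result["CarNeedsStarting"], result["CanStartCar"])
```

Note that CarNeedsStarting is listed before NeedToDrive in the dictionary but is still evaluated after it. Circular definitions make `static_order()` raise `graphlib.CycleError`, and a missing fact surfaces as a `KeyError`.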

Resetting default PDF reader in Windows 10?

On my home PC, I'm running Windows 10 (professionally re-installed a week ago after an unrepairable startup error appeared). When I want to open a PDF document already downloaded, this currently defaults to using Microsoft Edge, with various alternatives also offered.

I would prefer not to use a browser program for this, but instead to use LibreOffice 5.4, which I have on the computer and which can open PDFs in its Draw application, but this is not one of the options offered.

While I can open LibreOffice Draw and then search for and open a given PDF document, I cannot figure out how to make it my .pdf default such that I can first go to the document and then open it with LibreOffice 5.4 as the default or even an alternative option.

If I go to Windows 10's

Settings, Apps, Default apps, Choose default applications by file type,

then .pdf is already set to Microsoft Edge with no way I can see to change that to anything else.

Suggestions? (Please keep things as untechnical as possible!) {The poster formerly known as 87.81.230.195} 90.220.212.173 (talk) 21:00, 14 December 2017 (UTC)[reply]

I'm not an expert on this, but if you get to settings/"Choose default apps by file type", the application extensions are listed in order on the left. Go down to "pdf", click the icon to the right. Then it should give you choices of the software you have installed. Bubba73 You talkin' to me? 22:30, 14 December 2017 (UTC)[reply]
In my experience you can install whatever PDF reader you like (Evince, Adobe Acrobat, etc) but W10 will constantly reset the default to Edge. It seems to me that Microsoft decided that they know best when it comes to what to open a PDF file with. You can fix this, but you're going to have to get your hands dirty with Regedit:
  • regedit HKEY_CURRENT_USER\SOFTWARE\Classes\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppModel\Repository\Packages\Microsoft.MicrosoftEdge_25.10586.0.0_neutral__8wekyb3d8bbwe\MicrosoftEdge\Capabilities\FileAssociations
  • look for magic number next to pdf
  • regedit HKEY_CURRENT_USER\SOFTWARE\Classes\<Magic number>
  • in the right hand pane, create a new string value - NoOpenWith
  • repeat for html, htm, etc

HTH --TrogWoolley (talk) 10:53, 15 December 2017 (UTC)[reply]
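The registry steps above could in principle be scripted. The following Python sketch is untested against any particular Windows build and rests on assumptions: that the value under FileAssociations is named after the extension (".pdf") and holds the identifier the post calls the magic number, and that the Edge package key matches the one quoted above, which varies between Windows versions. The helper names are invented for this example. Back up the registry before trying anything like it.

```python
import sys

# Package key quoted from the post above; the version part is
# specific to one Windows build and will likely differ elsewhere.
EDGE_PACKAGE = "Microsoft.MicrosoftEdge_25.10586.0.0_neutral__8wekyb3d8bbwe"


def file_associations_path(package=EDGE_PACKAGE):
    """Registry path (under HKEY_CURRENT_USER) listing Edge's file types."""
    return (
        r"SOFTWARE\Classes\Local Settings\Software\Microsoft\Windows"
        r"\CurrentVersion\AppModel\Repository\Packages"
        "\\" + package + r"\MicrosoftEdge\Capabilities\FileAssociations"
    )


def progid_path(prog_id):
    """Registry path (under HKEY_CURRENT_USER) for the 'magic number' key."""
    return "SOFTWARE\\Classes\\" + prog_id


def hide_edge_for(extension=".pdf", package=EDGE_PACKAGE):
    """Add an empty NoOpenWith string value, as in the steps above."""
    if sys.platform != "win32":
        raise OSError("the Windows registry is only available on Windows")
    import winreg  # standard library, Windows only

    # Step 1-2: look up the identifier stored next to the extension.
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER,
                        file_associations_path(package)) as key:
        prog_id, _value_type = winreg.QueryValueEx(key, extension)

    # Step 3-4: create the NoOpenWith string value under that identifier.
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER,
                          progid_path(prog_id)) as key:
        winreg.SetValueEx(key, "NoOpenWith", 0, winreg.REG_SZ, "")
    return prog_id
```

Calling `hide_edge_for(".pdf")`, and then again for ".htm" and ".html", would correspond to the last step of the list. Nothing runs on import, so the registry is only touched when the function is called explicitly.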

Thanks for responding, Bubba73 and TrogWoolley. When I follow Bubba73's procedure (which I'd already tried, the displayed icon being that for Microsoft Edge), the choices offered are Microsoft Edge, Firefox, and Look for an app in the store. When I now click that third option, I'm offered a choice of 293(!) apps, many of which are free, but none of which are LibreOffice which I already have and can use in the non-default mode described above.
TrogWoolley's suggestion is a long way beyond the limits of my technical comfort zone, and having just paid £65 to a professional shop for an investigation and eventual Windows 10 reinstall (which I could perhaps have attempted myself, but didn't dare) to correct the aforementioned startup fault, I lack the financial fortitude to essay it.
I'll probably stick with opening LibreOffice and then searching for the desired document, rather than opening the document directly. I'll also consider migrating to Macintosh some time in the future: Microsoft is getting too nakedly feudal for my taste. {The poster formerly known as 87.81.230.195} 90.220.212.173 (talk) 11:08, 15 December 2017 (UTC)[reply]

I have had Adobe Reader as my default PDF reader for nearly a year now, and before that on a different installation for I think the same amount of time or more. (Also most of my other defaults are not Microsoft's choices.) From my experience on Windows 10 Pro, which seems to be supported by (nonRS) refs [2] [3], if you set a default program yourself it stays. The only time it tends to change is possibly for major upgrades (e.g. the Creators Update) and when something messes with the default programs.

Windows 10 doesn't like other stuff messing with the defaults, and if it detects this it will reset to the original defaults (recommended programs). If you keep installing junky programs that don't understand this, or your installed programs are junky and still don't understand this, and you give them admin access, then you will have problems. If not, you should not have to fool around with the registry. In other words, the solution recommended above should work fine for any program which has registered itself as being able to open PDFs, provided you aren't installing or running junky programs, with the slight possibility of needing to fix it every half year or so at worst after major upgrades. (IIRC this only happened once or twice due to new programs or some other reason.)

Now if your program has not registered itself as supporting PDFs it's a little more complicated mostly since Microsoft are trying to push you to store apps, although you still shouldn't need to fool around with the registry. One simplish solution is to right click on the file and look for the 'open with' option. Mouse over and in the submenu there should be an option to 'Choose another app'. As an alternative, if you select a PDF file and then go to the home tab, there should be an arrow next to the Open option, and you can also click on 'Choose another app' here too.

Either way, you should now get the list of apps registered as opening that file type. Scroll down to the bottom and there is a 'More apps' option. Clicking on that will show other apps, I think any that are set as defaults for any file or something. Maybe whatever you want to use will show up here. If not, scroll down to the bottom and there is a 'Look for another app on this PC' option. You will then need to browse to the program that you want to use to open PDFs. (With modern programs, this can sometimes be a little complicated if they have multiple executables and you're not sure which is the right one.)

Nil Einne (talk) 11:40, 15 December 2017 (UTC)[reply]

Hello, Nil Einne: Thanks for this. I'd got as far as 'Look for another app on this PC' and searching to find the LibreOffice file, but then couldn't figure out how to open the document in it. Encouraged by your advice, I've now gone on to drill down to LibreOffice's 'sdraw' sub-subfile and open a document, and now LibreOffice has been added to the other two main options (Edge and Firefox) offered under 'open with' when I right click on any PDF document. Not quite the default, but close enough for my needs.
Thanks again to yourself, and to Bubba73 & TrogWoolley. I'll mark this as . . .
Resolved
{The poster formerly known as 87.81.230.195} 90.220.212.173 (talk) 12:29, 15 December 2017 (UTC)[reply]