Feature #3044

open

Make RegExp for each language alphabet and numbers

Added by Gregory Magarshak over 1 year ago. Updated over 1 year ago.

Status:
New
Priority:
High
Category:
Qbix Maintenance
Start date:
12/03/2023
Due date:
12/06/2023 (about 18 months late)
% Done:

0%

Estimated time:
4.00 h

Description

Given a language code (e.g. "ru" or "zh"), we want to get a RegExp that matches the characters of that language. For languages like Chinese, which have many characters, you will need a Unicode range. For others, you can list the actual letters of the alphabet. Some languages also write their digits differently, so we need a pattern for digits too.

The output should be like this:

{
  "en": ["[A-Za-z]+", "[0-9]+"],
  "af": ["[multipleRanges]+", "[someRange]+"],
  "ru": ...,
  ...
}
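To make the shape above concrete, here is a rough sketch of a few filled-in entries. The ranges are assumptions taken from the Unicode block charts and have not been checked against every letter of each language, and ALPHABETS and compile are hypothetical names used only for illustration:

```javascript
// Illustrative table only: the ranges below are assumptions drawn from the
// Unicode block charts, not a finished or verified data set.
const ALPHABETS = {
  // Each entry is [letters pattern, digits pattern]
  en: ["[A-Za-z]+", "[0-9]+"],
  // Basic Russian Cyrillic plus Ё/ё, which sit outside the А-я run
  ru: ["[\\u0410-\\u044F\\u0401\\u0451]+", "[0-9]+"],
  // Hebrew letters alef through tav
  he: ["[\\u05D0-\\u05EA]+", "[0-9]+"],
  // Arabic letters, with Arabic-Indic digits alongside ASCII digits
  ar: ["[\\u0621-\\u064A]+", "[0-9\\u0660-\\u0669]+"],
  // CJK Unified Ideographs block (a range, since listing is impractical)
  zh: ["[\\u4E00-\\u9FFF]+", "[0-9]+"]
};

// Hypothetical helper: compile one entry into anchored RegExp objects
// that test whether a whole string consists of letters or of digits.
function compile(language) {
  const [letters, digits] = ALPHABETS[language];
  return {
    letters: new RegExp("^" + letters + "$"),
    digits: new RegExp("^" + digits + "$")
  };
}
```

For example, compile("ru").letters.test("Привет") should be true, while the same pattern should reject "Hello".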

What we can use it for

When sorting lists by first letter, for example, we want to show people entries in their own language first, before English. But if we sort by Unicode code point, the Latin letters will always come first. To solve this problem we will need alphabets at least for the main languages that we use:

ar, zh, fr, de, he, hi, it, ko, ms, pt, ru, es, uk, vi, en
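The sorting use case could look roughly like this. It is only a sketch: sortPreferredFirst is a hypothetical helper, and the Cyrillic range passed to it is an assumed stand-in for whatever the final alphabet table provides:

```javascript
// Hypothetical helper: returns a sorted copy in which entries that start
// with a letter of the preferred alphabet come before everything else.
function sortPreferredFirst(items, letterPattern, locale) {
  // letterPattern is a single character class, e.g. "[A-Za-z]"
  const first = new RegExp("^" + letterPattern);
  return items.slice().sort((a, b) => {
    const aRank = first.test(a) ? 0 : 1;
    const bRank = first.test(b) ? 0 : 1;
    if (aRank !== bRank) {
      return aRank - bRank; // preferred alphabet wins
    }
    return a.localeCompare(b, locale); // otherwise normal ordering
  });
}

const names = ["Alice", "Борис", "Bob", "Анна"];
const sorted = sortPreferredFirst(names, "[\\u0410-\\u044F\\u0401\\u0451]", "ru");
// sorted is ["Анна", "Борис", "Alice", "Bob"]
```

Ranking by alphabet first and only then comparing strings keeps the user's own script on top regardless of where it falls in code point order.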

Resources

https://jrgraphix.net/r/Unicode/0590-05FF <-- probably this is easiest

http://unicode.org/charts/

https://jrgraphix.net/research/unicode_blocks.php

https://www.ling.upenn.edu/courses/Spring_2003/ling538/UnicodeRanges.html

Use your judgment and logic when working from these charts. It doesn't have to be perfect, e.g. Russian doesn't have "і" but Ukrainian does. Just try to fill in the languages above. Some are harder, like "hi" for Hindi, and will take some research: https://stackoverflow.com/a/9523932
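For the harder scripts, one option worth considering is JavaScript's Unicode property escapes (ES2018, require the "u" flag), which match by script instead of by hand-written ranges. This is a suggestion, not something the ticket asks for, and the variable names are just for illustration:

```javascript
// Script-based patterns; property escapes only work with the "u" flag.
const devanagari = /^\p{Script=Devanagari}+$/u; // used for Hindi
const cyrillic = /^\p{Script=Cyrillic}+$/u;     // Russian, Ukrainian, ...
const anyDigit = /^\p{Nd}+$/u;                  // decimal digits of any script

// A script is broader than one language's alphabet: the Cyrillic pattern
// accepts the Ukrainian letter "і" even though Russian does not use it.
const acceptsUkrainianI = cyrillic.test("\u0456");
const acceptsLatinI = cyrillic.test("i");
```

Because a script covers several languages, this approach cannot tell "ru" from "uk" on its own, so per-language letter lists would still be needed wherever that distinction matters.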

Deliverable

Please create the file plugins/Q/js/RegExp.js and move Q.RegExp from Q.js into it. Look at Q/js/Colors.js as an example.

That file should contain all the letters and codes, plus the method Q.RegExp.letters(language), with language defaulting to navigator.language.
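A minimal sketch of what letters(language) might do is below. It is written as a standalone function and does not reflect how Q.RegExp is actually structured. The LETTERS table, the fallback to English for unknown languages, and the trimming of region suffixes such as "ru-RU" are all assumptions:

```javascript
// Assumed, trimmed-down data; the real file would cover every language.
const LETTERS = {
  en: "[A-Za-z]",
  ru: "[\\u0410-\\u044F\\u0401\\u0451]",
  // Ukrainian: drops Ъ, Ы, Э, Ё and adds Є, І, Ї, Ґ (both cases)
  uk: "[\\u0410-\\u0429\\u042C\\u042E\\u042F\\u0430-\\u0449\\u044C\\u044E\\u044F" +
      "\\u0404\\u0406\\u0407\\u0490\\u0454\\u0456\\u0457\\u0491]"
};

// Hypothetical stand-in for Q.RegExp.letters(language).
function letters(language) {
  if (!language) {
    // navigator only exists in browsers (and recent Node), so guard it
    language = (typeof navigator !== "undefined" && navigator.language) || "en";
  }
  const base = language.toLowerCase().split("-")[0]; // "ru-RU" -> "ru"
  const pattern = LETTERS[base] || LETTERS.en;       // assumed fallback
  return new RegExp(pattern + "+");
}
```

With this table, letters("uk").test("і") should be true and letters("ru").test("і") false, which mirrors the Russian versus Ukrainian distinction mentioned under Resources.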

Actions #1

Updated by Gregory Magarshak over 1 year ago

  • Description updated (diff)