Sorting digits, English letters, and Chinese characters (by pinyin) in JavaScript

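Looking at my own blog over the past couple of days, I noticed the ordering on the tags page was a bit odd: the Chinese characters weren't sorted by pinyin, and even I couldn't find anything. When I wrote the code I just called sort and only glanced at the result. The letters were in the right order and came before the Chinese characters, so I didn't look at the details. I only spotted the problem today while searching for a tag.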

This is the old code:

tags.sort((a, b) => {
  return a.group < b.group ? -1 : 1
})

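The code is crude. What I actually expected was an order of digits first, then English letters, then Chinese characters by pinyin. That raises a question: how do strings of Chinese characters, digits, and English letters compare with each other? MDN describes it like this: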

First, objects are converted to primitives using Symbol.ToPrimitive with the hint parameter be 'number'.

If both values are strings, they are compared as strings, based on the values of the Unicode code points they contain.

Otherwise JavaScript attempts to convert non-numeric types to numeric values:

Boolean values true and false are converted to 1 and 0 respectively.

null is converted to 0.

undefined is converted to NaN.

Strings are converted based on the values they contain, and are converted as NaN if they do not contain numeric values.

If either value is NaN, the operator returns false.

Otherwise the values are compared as numeric values.

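In short:

If both values are strings, they are compared by Unicode value
Otherwise, non-numeric values are converted to numbers before comparing (JS is weakly typed), and whatever can't be converted becomes NaN
Any comparison involving NaN returns false, so NaN is neither greater nor smaller than a number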
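
The rules above can be checked directly in a console. This is only a small sketch with made-up operands, not part of the blog's code; the variable names are hypothetical.

```javascript
// Both operands are strings: compared by UTF-16 code units, not numerically.
const bothStrings = '10' < '9'        // true, because '1' (0x31) < '9' (0x39)

// One operand is a number: the string is converted to a number first.
const mixed = '10' < 9                // false, because 10 < 9 is false

// A non-numeric string becomes NaN, and every comparison with NaN is false.
const nanLess = 'a' < 1               // false
const nanGreater = 'a' > 1            // false

// ASCII digits and letters have smaller code units than Chinese characters.
const digitBeforeLetter = '2' < 'a'   // true
const letterBeforeHan = 'x' < '标'     // true

console.log(bothStrings, mixed, nanLess, nanGreater, digitBeforeLetter, letterBeforeHan)
// true false false false true true
```

Since the tags here are all strings, only the first rule matters for the sort: digits and letters land before Chinese characters for free, but the Chinese characters themselves stay in code-unit order.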

So... Chinese characters have to be singled out when comparing, because their Unicode order does not match pinyin order. Here is the new code:

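// Note: every element of tags is a string
const tags = ['标', 'a', 'g', '2', 'x', '1', '张', '前']

// Note: sort() sorts in place, so sortedTags is the same array as tags
const sortedTags = tags.sort((a, b) => {
  const regexp = /[a-zA-Z0-9]/

  if (regexp.test(a) || regexp.test(b)) {
    // At least one of a and b contains a digit or letter: compare the old way
    // I ignore case here; adjust this to your own needs
    const x = a.toLowerCase()
    const y = b.toLowerCase()
    if (x === y) return 0 // equal tags must return 0 to keep the comparator consistent
    return x < y ? -1 : 1
  } else {
    // Both are Chinese: use localeCompare
    return a.localeCompare(b, 'zh')
  }
})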

console.log(sortedTags)

// output
// (8) ['1', '2', 'a', 'g', 'x', '标', '前', '张']
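
To see why the Chinese branch needs localeCompare, here is a minimal sketch comparing the default order with the locale-aware one. The array hanTags is a hypothetical sample, and the localeCompare result depends on the locale data shipped with the runtime, so only the code-unit order is annotated.

```javascript
// Hypothetical sample: three tags made only of Chinese characters.
const hanTags = ['标', '张', '前']

// Default sort compares UTF-16 code units: 前 (U+524D), 张 (U+5F20), 标 (U+6807).
const byCodeUnit = [...hanTags].sort()

// Locale-aware sort; the exact order depends on the runtime's 'zh' collation data.
const byLocale = [...hanTags].sort((a, b) => a.localeCompare(b, 'zh'))

console.log(byCodeUnit.join(' ')) // 前 张 标
console.log(byLocale.join(' '))
```

The code-unit order has nothing to do with pinyin (qian, zhang, biao), which matches what I saw on the tags page.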

If you are not familiar with localeCompare, see its documentation.
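
For longer lists, MDN suggests creating an Intl.Collator once and reusing its compare function instead of calling localeCompare for every pair. The following is only a sketch of that variant for the Chinese-only case; zhCollator and names are hypothetical names, and the resulting order again depends on the runtime's locale data.

```javascript
// Hypothetical helper: build the 'zh' collator once and reuse it.
const zhCollator = new Intl.Collator('zh')

// Hypothetical sample with a duplicate, to show that equal strings compare as 0.
const names = ['张', '标', '前', '标']

// collator.compare is already bound, so it can be passed to sort directly.
const sortedNames = [...names].sort(zhCollator.compare)

console.log(zhCollator.compare('标', '标')) // 0
console.log(sortedNames.join(' '))
```

It could be dropped into the else branch of the comparator above in place of a.localeCompare(b, 'zh').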