Commit 8634a759 authored by Andi Vajda

fixed bug in logic computing max_char for PyUnicode_New()

parent 20e75efe
@@ -206,7 +206,8 @@ EXPORT PyObject *PyUnicode_FromUnicodeString(const UChar *utf16, int len16)
             UChar32 cp;
             U16_NEXT(utf16, i, len16, cp);
-            max_char |= cp; // we only care about the leftmost bit
+            if (cp > max_char)
+                max_char = cp;
             len32 += 1;
         }
  • mentioned in issue #155 (closed)

  • Thanks for the quick fix!

    Interesting how this optimization ended up generating a code-point value larger than 0x10FFFF exactly for the situation I was able to describe in the bug report.

    Fwiw, another option would have been to keep the optimization, and only use it as min(max_char, 0x10FFFF) later, which would be an O(1) cost instead of the new O(length) condition check.
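
For readers following along, here is a minimal sketch of how OR-accumulation can overshoot 0x10FFFF. It works on plain arrays of code points rather than ICU's UTF-16 iteration, and the helper names and the local `MAX_UNICODE` define are made up for illustration; none of this is the project's actual code. The sample input (U+100000 together with U+1F600) is just one combination that triggers the overshoot, not necessarily the one from the bug report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_UNICODE 0x10FFFF /* largest valid code point (defined locally for this sketch) */

/* Hypothetical helper: OR all code points together. The result has the same
 * highest set bit as the true maximum, but its value may be larger. */
static uint32_t max_char_by_or(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    uint32_t max_char = 0;
    for (size_t i = 0; i < n; i++)
        max_char |= cps[i];
    return max_char;
}

/* Hypothetical helper: track the true maximum with a comparison per code point. */
static uint32_t max_char_by_compare(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    uint32_t max_char = 0;
    for (size_t i = 0; i < n; i++) {
        if (cps[i] > max_char)
            max_char = cps[i];
    }
    return max_char;
}
```

With the input `{0x100000, 0x1F600}`, the OR version yields 0x11F600, which is above 0x10FFFF even though both inputs are valid code points, while the comparison version yields 0x100000.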

  • > Fwiw, another option would have been to keep the optimization, and only use it as min(max_char, 0x10FFFF) later, which would be an O(1) cost instead of the new O(length) condition check.

    The optimization is still in, and its complexity is the same as before: O(length). The max_char loop is unchanged.

    Ultimately, the only addition was to cap max_char at MAX_UNICODE before calling PyUnicode_New(). See 471966eb.

    It took two commits to get it right.
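
A rough sketch of the capping idea, for illustration only: the commit 471966eb itself is not shown here, so `cap_max_char` and `char_width` are hypothetical names, and `char_width` is a simplified model of how the maximum character selects a 1-, 2-, or 4-byte storage width, assuming the usual 0x100 and 0x10000 boundaries.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UNICODE 0x10FFFF /* largest valid code point (defined locally for this sketch) */

/* Hypothetical helper: the O(1) clamp applied once, after the loop. */
static uint32_t cap_max_char(uint32_t max_char)
{
    return max_char > MAX_UNICODE ? MAX_UNICODE : max_char;
}

/* Simplified model (an assumption, not CPython code): bytes per character
 * selected by max_char. The boundaries are powers of two, so only the
 * highest set bit of max_char matters. */
static int char_width(uint32_t max_char)
{
    assert(max_char <= MAX_UNICODE); /* out-of-range values must be capped first */
    if (max_char < 0x100)
        return 1;
    if (max_char < 0x10000)
        return 2;
    return 4;
}
```

Because an OR-accumulated value shares its highest set bit with the true maximum, `char_width(cap_max_char(or_value))` gives the same width as the exact maximum would, which is why the cheaper loop can stay.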

  • Huhu! I didn't realize you actually did that in another commit. That's all I was suggesting! All is good now, thanks.
