{"id":800,"date":"2023-04-18T21:35:32","date_gmt":"2023-04-18T19:35:32","guid":{"rendered":"https:\/\/codevision.net.pl\/?p=800"},"modified":"2023-05-07T19:28:26","modified_gmt":"2023-05-07T17:28:26","slug":"beware-of-char-sign-and-unsigned-comparison-portable-code","status":"publish","type":"post","link":"https:\/\/codevision.net.pl\/index.php\/2023\/04\/18\/beware-of-char-sign-and-unsigned-comparison-portable-code\/","title":{"rendered":"Beware of char and its traps"},"content":{"rendered":"\n<p>Recently I was involved in testing an API that had been working for years but required some refactoring for future use cases. The API is used in an automotive ECU powered by ARM processors. Nothing really special or difficult, but&#8230;<\/p>\n\n\n\n<p>The API took data and its SHA-256 digest, both represented by <code>std::string<\/code>. The input was then validated internally by calculating the SHA-256 of the data and comparing both hashes. The calculated hash was represented by <code>std::vector&lt;uint8_t&gt;<\/code>. Surprisingly, the test cases produced for it were failing, even though the content was exactly the same!<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; gutter: false; highlight: [4]; title: ; notranslate\" title=\"\">\nstd::string user_input = &quot;\\x9f\\x68\\x12\\x0a&quot;;\nstd::vector&lt;uint8_t&gt; calculated_input {0x9f, 0x68, 0x12, 0x0a};\n\nassert(user_input.size() == calculated_input.size());\nassert(std::equal(std::begin(user_input), std::end(user_input), std::begin(calculated_input)));\n<\/pre><\/div>\n\n\n<p>As the API was surely working, my suspicions turned towards the test code. But after checking it a dozen times, I couldn&#8217;t find anything wrong. Suddenly I realized that the test code is compiled and executed in the development environment on x86, while the API is used on an ARMv8 target. 
The <code>std::string<\/code> stores and manipulates a sequence of character-like objects defined by character traits; its element type is plain <code>char<\/code>. Whether plain <code>char<\/code> is signed or unsigned is implementation-defined and depends on the platform and compiler. On most x86 GNU\/Linux and Microsoft systems <code>char<\/code> is signed. On ARM or PowerPC, in turn, it is usually unsigned. The suspicion was confirmed after running the test cases on the target.<\/p>\n\n\n\n<p>So what really happened? When the content of a <code>std::string<\/code> is compared with the content of a <code>std::vector&lt;uint8_t&gt;<\/code>, a (signed) <code>char<\/code> is compared with a <code>uint8_t<\/code> value. Before the comparison both operands undergo integer promotion to <code>int<\/code>. If a byte stored in the <code>std::string<\/code> exceeds the range of <code>signed char<\/code> (here: 127 = 0x7F), it is held as a negative value, so 0x9F becomes -97 while the <code>uint8_t<\/code> 0x9F stays 159. The sign survives any widening: converted to an unsigned type, as in the snippet below, the negative value turns into a huge number. In effect <code>std::equal<\/code> reports a mismatch.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; gutter: false; title: ; notranslate\" title=\"\">\nsigned char x = 127;\nstd::cout &lt;&lt; &quot;x = &quot; &lt;&lt; (int)x &lt;&lt; std::endl;           \/\/ x = 127\nsigned char y = 128;\nstd::cout &lt;&lt; &quot;y = &quot; &lt;&lt; (int)y &lt;&lt; std::endl;           \/\/ y = -128\nunsigned int z = y;\nstd::cout &lt;&lt; &quot;z = &quot; &lt;&lt; (unsigned int)z &lt;&lt; std::endl;  \/\/ z = 4294967168\n<\/pre><\/div>\n\n\n<p>How can the code be fixed so that it works independently of the processor architecture? 
It seems that a simple predicate solves the issue and guarantees that both values are converted to <code>uint8_t<\/code> before they are compared:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; gutter: false; title: ; notranslate\" title=\"\">\nassert(std::equal(std::begin(user_input), std::end(user_input), std::begin(calculated_input), &#x5B;](uint8_t lhs, uint8_t rhs) { return lhs == rhs; }));\n<\/pre><\/div>\n\n\n<p><strong>In conclusion<\/strong>: beware of comparisons between signed and unsigned values, keep potential overflows in mind and care about code portability.<\/p>\n\n\n\n<p> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently I was involved in testing an API that had been working for years but required some refactoring for future use cases. The API is used in an automotive ECU powered by ARM processors. Nothing really special or difficult, but&#8230; The API took data and its SHA-256 digest, both represented by std::string. The [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"zakra_sidebar_layout":"customizer","zakra_remove_content_margin":false,"zakra_sidebar":"customizer","zakra_transparent_header":"customizer","zakra_logo":0,"zakra_main_header_style":"default","zakra_menu_item_color":"","zakra_menu_item_hover_color":"","zakra_menu_item_active_color":"","zakra_menu_active_style":"","zakra_page_header":true,"_eb_attr":"","om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[17,16],"tags":[22,8,7],"class_list":["post-800","post","type-post","status-publish","format-standard","hentry","category-c","category-development","tag-c-2","tag-development","tag-software"],"aioseo_notices":[],"_links":{"
self":[{"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/posts\/800","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/comments?post=800"}],"version-history":[{"count":12,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/posts\/800\/revisions"}],"predecessor-version":[{"id":835,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/posts\/800\/revisions\/835"}],"wp:attachment":[{"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/media?parent=800"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/categories?post=800"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/codevision.net.pl\/index.php\/wp-json\/wp\/v2\/tags?post=800"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}