Diffstat (limited to 'content/posts/2004-04-07-party-like-it-s-1992.html')
-rw-r--r-- | content/posts/2004-04-07-party-like-it-s-1992.html | 87 |
1 files changed, 87 insertions, 0 deletions
diff --git a/content/posts/2004-04-07-party-like-it-s-1992.html b/content/posts/2004-04-07-party-like-it-s-1992.html new file mode 100644 index 0000000..427202b --- /dev/null +++ b/content/posts/2004-04-07-party-like-it-s-1992.html @@ -0,0 +1,87 @@ +--- +date: "2004-04-07T21:41:10Z" +title: Party Like It's 1992 +--- + +<p> +I've been using <a +href='http://msdn.microsoft.com/library/default.asp?url=/library/en-us/security/security/cryptography_objects.asp'><acronym +title='Cryptographic API Component Object Model'>CAPICOM</acronym></a> +at work. Since most <acronym +title='Component Object Model'>COM</acronym> objects are supposed to +work with <a href='http://msdn.microsoft.com/vbasic/'><acronym +title='Visual Basic'>VB</acronym></a>, the string values returned by +<acronym title='Component Object Model'>COM</acronym> functions (in my +case <a +href='http://msdn.microsoft.com/library/default.asp?url=/library/en-us/security/security/certificate_export.asp'>CAPICOM::Certificate.Export()</a>) +have some bizarre and baroque semantics when called from C++. One quirk +I found particularly amusing was the memory allocation behind <a +href='http://msdn.microsoft.com/library/default.asp?url=/library/en-us/automat/htm/chap6_7isy.asp'><acronym +title='Binary STRing'>BSTR</acronym></a>s; here's what <a +href='http://blogs.gotdotnet.com/ericli/permalink.aspx/853ae05f-7610-4531-ab1b-070695e61168'>"Eric's +Complete Guide to BSTR Semantics"</a> has to say about what's +happening under the hood for <a +href='http://msdn.microsoft.com/library/default.asp?url=/library/en-us/automat/htm/chap6_7isy.asp'><acronym +title='Binary STRing'>BSTR</acronym></a>s: +</p> + +<blockquote cite='http://blogs.gotdotnet.com/ericli/permalink.aspx/853ae05f-7610-4531-ab1b-070695e61168'> +<p> +COM code uses the BSTR to store a Unicode string, short for "Basic +String". 
(So called because this method of storing strings was developed +for OLE Automation, which was at the time motivated by the development + of the Visual Basic language engine.) +</p> + +<p>...</p> + +<p> +<ol> +<li>If you write a function which takes an argument of type BSTR then +you are required to accept NULL as a valid BSTR and treat it the same as +a pointer to a zero-length BSTR. COM uses this convention, as does +Visual Basic and VBScript, so if you want to play well with others you +have to obey this convention. If a string variable in VB happens to be +an empty string then VB might pass it as NULL or as a zero-length buffer +-- it is entirely dependent on the internal workings of the VB +program.</li> +<li>BSTRs are always allocated and freed with SysAllocString, SysAllocStringLen, SysFreeString and so on. The underlying memory is cached by the operating system and it is a serious, heap-corrupting error to call "free" or "delete" on a BSTR. Similarly it is also an error to allocate a buffer with "malloc" or "new" and cast it to a BSTR. <u>Internal operating system code makes assumptions about the layout in memory of a BSTR</u> which you should not attempt to simulate.</li> +<li>The number of characters in a BSTR is fixed. A ten-byte BSTR contains five Unicode characters, end of story.</li> +<li> +<p>A BSTR always points to the first valid character in the buffer. +This is not legal:</p> + +<pre> +<code> +BSTR bstrName = SysAllocString(L"John Doe"); +BSTR bstrLast = &bstrName[5]; // ERROR +</code> +</pre> + +<p> +bstrLast is not a legal BSTR +</p> +</li> +</ol> +</p> + +<p>....</p> + +<p> +When you call SysAllocString(L"ABCDE") the operating system actually allocates sixteen bytes. <u>The first four bytes are a 32 bit integer representing the number of valid bytes in the string</u> -- initialized to ten in this case. The next ten bytes belong to the caller and are filled in with the data passed in to the allocator. 
<u>The final two bytes are filled in with zeros</u>. You are then given a pointer to the data, not to the header. +</p> +</blockquote> + +<p>(Emphasis is mine)</p> + +<p> +Strings with a length prefix <em>and</em> a double-NULL suffix. Now +that's what I call <em>efficient</em> use of memory! Seriously though, +this is like some sort of programming time warp; it reminds me of both +the Pascal-style strings with a single-byte length prefix that the +<a href='http://developer.apple.com/macos/'>Mac Toolbox</a> +calls used and the associated (and equally wacky) string-conversion +functions. + Ah, history. +</p> + |
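The arithmetic in the quoted passage (a four-byte length prefix, the character data, then two zero bytes) can be checked with a small portable sketch. This is only a model of the layout as the quote describes it: the names modelAllocationSize, modelBuffer and modelByteLength are hypothetical, char16_t stands in for the 16-bit characters, and none of it is a substitute for SysAllocString, which the quote says is the only legitimate way to create a BSTR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical model of the layout described in the quote. Real BSTRs must
// come from SysAllocString and friends; this only illustrates the byte math.
constexpr std::size_t kPrefixBytes = 4;      // 32-bit count of valid bytes
constexpr std::size_t kTerminatorBytes = 2;  // one 16-bit zero character

// Total bytes the quote says are allocated for `chars` 16-bit characters.
constexpr std::size_t modelAllocationSize(std::size_t chars) {
    return kPrefixBytes + chars * sizeof(char16_t) + kTerminatorBytes;
}

// Build a byte buffer laid out as: [byte length][character data][two zeros].
std::vector<unsigned char> modelBuffer(const std::u16string& text) {
    const std::uint32_t byteLength =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    // Zero-initialised, so the trailing terminator bytes are already zero.
    std::vector<unsigned char> buffer(modelAllocationSize(text.size()), 0);
    std::memcpy(buffer.data(), &byteLength, kPrefixBytes);
    std::memcpy(buffer.data() + kPrefixBytes, text.data(), byteLength);
    return buffer;
}

// Read the length prefix back out of a modelled buffer.
std::uint32_t modelByteLength(const std::vector<unsigned char>& buffer) {
    std::uint32_t byteLength = 0;
    std::memcpy(&byteLength, buffer.data(), kPrefixBytes);
    return byteLength;
}
```

For the five-character string from the quote, modelAllocationSize(5) works out to 4 + 10 + 2 = 16 bytes, and the prefix read back by modelByteLength is 10, which matches the numbers given for SysAllocString(L"ABCDE").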