Assembly: Mono.Posix (versions 1.0.5000.0, 2.0.0.0, 4.0.0.0)
Thread safety: This type is safe for multithreaded operations.
Base type: System.Text.Encoding

A Unix filename encoding.

Unix filenames are an interesting construct, as there is no encoding. The operating system kernel only maintains a sequence of bytes for the filename, with no encoding implied. This makes it non-trivial (or impossible) to determine what encoding a filename is in: it could be UTF-8, ASCII, Shift-JIS, or some binary data inserted by a freak touch(1) accident (try touch "$(printf "test\xffname")" within a bash(1) shell for an example).

On the other hand, developers and users expect filenames to be strings, and the System.String type is a UTF-16 encoded string. Consequently, all filesystem byte sequences must be converted into some UTF-16 encoded string so that files can be used sensibly.

All filename strings provided to/from the Mono.Unix and Mono.Unix.Native types are passed through UnixEncoding. UnixEncoding does the following:

When unmarshaling a filename from unmanaged to managed code (such as with ), UnixEncoding first tries to decode the byte sequence as a UTF-8 string. If the UTF-8 decode fails, any "invalid" characters are represented as the sequence of EscapeByte followed by the "offending" byte cast to a char.

When marshaling a filename from managed to unmanaged code (such as via or ), the filename is encoded using UTF-8 unless EscapeByte is encountered, in which case the EscapeByte character is skipped and the following character is marshaled as a byte.

The upshot of all this is that Mono.Unix and Mono.Unix.Native can list, access, and open all files on your filesystem, regardless of encoding. The downside is that all such support exists only within the Mono.Unix and Mono.Unix.Native namespaces. You won't be able to pass non-Unicode filenames as command-line arguments.

In short, it's a Glorious Hack. Rejoice. Or something.

What this means:

Any filename on disk, in any encoding (or lack thereof), can be found and used with the Mono.Unix and Mono.Unix.Native types.

You don't need to specify the encoding of filenames (which could be wrong anyway, since a directory may contain files in more than one encoding).

Printing or otherwise saving/displaying the filename may give incorrect results, since the string contains extra escaping that's relevant only to the Mono.Unix and Mono.Unix.Native classes. I'm not losing any sleep over this, because if the encoding is unknown the strings couldn't be displayed correctly anyway...

You may not be able to use the System.IO classes to use a file obtained via Mono.Unix and Mono.Unix.Native classes. This is because System.IO doesn't know about UnixEncoding and the escape mechanism it uses. I don't consider this to be a problem, as the System.IO classes couldn't open these files anyway: they weren't returned by , and they were effectively invisible to normal Mono programs. They still are. If the filename contains Mono.Unix.UnixEncoding.EscapeByte, then you won't be able to use System.IO with that file. If the filename doesn't contain EscapeByte, it can be used with System.IO.

You still can't specify filenames in arbitrary encodings on the mono command line. Mono will still try to decode these as either UTF-8 strings or as an encoding listed in the MONO_EXTERNAL_ENCODINGS environment variable.

Questions & Answers

Q: Why UTF-8? Why not use ?
A: Because UTF-8 is sane and should always be used. :-)

Q: Seriously?
A: Ha ha only serious. Plus, since a directory can contain files in more than one encoding, expecting the developer to provide the right encoding for each file would require the developer to be clairvoyant. Plus, using UTF-8 allows any Unicode character to be used in a filename (which could be considered a bad thing, depending).

Q: What is EscapeByte?
A: U+0000. Since this is the terminating null, it by definition cannot appear within a Unix filename, so it's a sane choice.

Q: Why not use byte[] instead of strings for filenames in , , etc.?
A: Because byte[] is fugly to work with, so it would need to be offered in addition to the string versions, which would double all the file-related APIs. Do you really want to explain the difference between these APIs?


public static int open (string pathname, OpenFlags flags);
public static int open (byte[] pathname, OpenFlags flags);

(Hint: if you do want to explain the difference between these, you're masochistic.) Furthermore, what should be (or , or any other string-typed structure member)? If it's a byte[], developers will still need a way to convert it to a string for debugging and display to the user, but the developer can't know what encoding to use (it could be anything), so this becomes an impossible problem. UnixEncoding may be a Glorious Hack, but at least it leaves the API usage unambiguous.

Q: .NET doesn't have these limitations! Why does Mono?
A: Because Windows stores all filenames on disk as Unicode (and has since Windows NT 3.1 and/or the introduction of Long Filenames in Windows 95), so it doesn't need to worry (as much) about the arbitrary filename encoding problem. Short filenames might be in a local encoding, but CIFS uses Unicode, so you can't access non-Unicode filenames over a network share.

Q: Why doesn't Mono do this (or something like it) so that System.IO can read and process all files?
A: Priorities. :-) Plus, I thought it would be easy for Mono to do this, but after implementing this type I'm not sure the other maintainers would wish to deal with the issues of arbitrary filename encodings. Plus, most current Linux distros already default to UTF-8, so (hopefully) this won't be an issue for too much longer (10 years?).

Constructor 1.0.5000.0 2.0.0.0 4.0.0.0

Constructs a new instance of the UnixEncoding class.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Boolean To be added.

To be added.

To be added. To be added.

Field 1.0.5000.0 2.0.0.0 4.0.0.0 System.Char

The character which precedes all characters which need escaping during managed->unmanaged marshaling.
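
As a rough illustration of how this character is used in both directions, the following is a minimal, self-contained sketch of the escaping scheme described in the remarks for this type. It is not Mono's implementation: the class name EscapeSketch and the helpers Decode, Encode, and SequenceLength are hypothetical, and the real UnixEncoding may handle edge cases (such as truncated multi-byte sequences) differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the documented scheme; NOT Mono's UnixEncoding.
static class EscapeSketch
{
    // Same value the documentation gives for EscapeByte.
    public const char EscapeByte = '\u0000';

    // Strict UTF-8: throws on invalid input instead of substituting U+FFFD.
    static readonly UTF8Encoding Strict = new UTF8Encoding(false, true);

    // Sequence length implied by a UTF-8 lead byte, or 0 if it cannot start one.
    static int SequenceLength(byte b)
    {
        if (b < 0x80) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        return 0;
    }

    // Unmanaged -> managed: valid UTF-8 decodes normally; any other byte
    // becomes EscapeByte followed by that byte cast to a char.
    public static string Decode(byte[] raw)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < raw.Length)
        {
            int len = SequenceLength(raw[i]);
            if (len > 0 && i + len <= raw.Length)
            {
                try
                {
                    sb.Append(Strict.GetString(raw, i, len));
                    i += len;
                    continue;
                }
                catch (DecoderFallbackException)
                {
                    // Not a valid sequence: fall through and escape this byte.
                }
            }
            sb.Append(EscapeByte).Append((char)raw[i]);
            i++;
        }
        return sb.ToString();
    }

    // Managed -> unmanaged: UTF-8 encode, except that EscapeByte is skipped
    // and the char after it is written as a single raw byte.
    public static byte[] Encode(string name)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < name.Length)
        {
            if (name[i] == EscapeByte)
            {
                if (i + 1 < name.Length)
                    output.Add(unchecked((byte)name[i + 1]));
                i += 2;
                continue;
            }
            int start = i;
            while (i < name.Length && name[i] != EscapeByte) i++;
            output.AddRange(Encoding.UTF8.GetBytes(name.Substring(start, i - start)));
        }
        return output.ToArray();
    }

    static void Main()
    {
        // The bytes created by: touch "$(printf "test\xffname")"
        byte[] raw = { 0x74, 0x65, 0x73, 0x74, 0xFF, 0x6E, 0x61, 0x6D, 0x65 };
        string managed = Decode(raw);
        byte[] back = Encode(managed);

        Console.WriteLine(managed.Length);              // 10: nine bytes plus one escape char
        Console.WriteLine(managed.IndexOf(EscapeByte)); // 4: the escape sits before the 0xFF byte
        Console.WriteLine(BitConverter.ToString(back) == BitConverter.ToString(raw)); // True
    }
}
```

In this sketch, a check such as managed.IndexOf(EscapeByte) >= 0 is also a cheap way to tell whether a name carries escaped bytes and therefore, per the remarks, cannot be handed to System.IO.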

This character is U+0000.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added. To be added. To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Byte[] To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added. To be added. To be added. To be added. To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added. To be added. To be added. To be added. To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added. To be added. To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added. To be added. To be added. To be added. To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Text.Decoder

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Text.Encoder

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Int32 To be added.

To be added.

To be added. To be added.

Method 1.0.5000.0 2.0.0.0 4.0.0.0 System.Byte[]

To be added.

To be added. To be added.

Field 1.0.5000.0 2.0.0.0 4.0.0.0 System.Text.Encoding

A default UnixEncoding instance.

This member can be used instead of constructing a new instance.