Path Yet Another Make Unique Name

If someone has time to convert to long text:

Stolen from: http://www.geocities.com/SiliconValley/4942/paths.html

e.g. to prove it's not spam:

Way back when Microsoft released Internet Explorer 4.0, they bundled with it a number of upgrades to the operating system including a new DLL called SHLWAPI.DLL (Shell Lightweight Utility APIs). That DLL provided programmers with some useful path manipulation functions (amongst other things), but obviously any applications that made use of those functions would also require that the user had installed Internet Explorer.

Of course, now that Microsoft has 'integrated' their web browser into the operating system, the availability of SHLWAPI.DLL is no longer as much of an issue, so this article may seem a bit dated (it's been a while since I've had a chance to update this site). However, if you're still writing code for Windows 95 and NT 4.0, you might find it helpful to know that many of the functions in SHLWAPI.DLL are already available in a regular installation of Windows 95 in SHELL32.DLL - they're just not documented.

Note, however, that there are some subtle differences between the SHLWAPI routines and their corresponding SHELL32 implementations. Pay careful attention to the documentation provided here before trying to use these functions as direct replacements for any SHLWAPI functions with similar sounding names. Also note that, as is usually the case with undocumented functions, all of these are only exported by ordinal.

Combining and Constructing

To start off with, we're going to look at some functions for combining and constructing new paths (see Figure 1). The PathAppend? function (ordinal 36) joins two paths together. lpszPath2 is appended onto the end of lpszPath1, automatically adding or removing backslashes as required. The resulting path is also canonicalized (the '..' and '.' character sequences are compacted out of the path). It is assumed that lpszPath1 is at least MAX_PATH in size. If there is not enough space to perform the operation the function will return NULL. Otherwise it returns a pointer to lpszPath1.

PathCombine? (ordinal 37) may seem similar to Path-Append, but it's considerably more powerful. You get to provide a separate destination buffer, lpszDestPath, so you don't have to overwrite your source data. Also, the function doesn't necessarily append lpszSrcPath2 onto the end of lpszSrcPath1. If lpszSrcPath2 starts with a backslash it is appended to the root of lpszSrcPath1. If lpszSrcPath2 is a full path (with drive specification or UNC) it overwrites lpszSrcPath1 altogether. The function returns NULL if there is not enough space. If the operation is successful it returns a pointer to lpszDestPath.

  LPSTR WINAPI PathAppend?(
    LPSTR lpszPath1,
    LPCSTR lpszPath2);

LPSTR WINAPI PathCombine?( LPSTR lpszDestPath, LPCSTR lpszSrcPath1, LPCSTR lpszSrcPath2);

LPSTR WINAPI PathAddBackslash?( LPSTR lpszPath);

LPSTR WINAPI PathBuildRoot?( LPSTR lpszPath, int drive);

Figure 1 Path Construction Functions

The PathAddBackslash? function (ordinal 32) simply adds a backslash to the end of the specified path if there isn't already one there. The return value points to the end of the path, i.e. the position after the backslash. If there isn't enough space to add a backslash (again assuming a size of MAX_PATH) the function returns NULL. PathBuildRoot? (ordinal 30) is even simpler - it builds a new path of the form X:\ in the specified buffer. The second parameter, drive, specifies the drive letter (0 for drive A, 1 for drive B, etc.). The function returns a pointer to lpszPath. Extracting Component Parts The next couple of functions can be used to access the various component parts of a pathname (see Figure 2). PathFindFileName? (ordinal 34) returns a pointer to the start of the filename in the specified path. In the event that the path is a simple filename, it just returns a pointer to the path itself. PathFindExtension? and PathGetExtension? (ordinals 31 and 158) both return a pointer to the file extension of the specified filename, but PathGetExtension? doesn't include the dot as part of the extension. Both functions return a blank string if the filename has no extension.

  LPSTR WINAPI PathFindFileName?(
    LPCSTR lpszPath);

LPSTR WINAPI PathFindExtension?( LPCSTR lpszPath);

LPSTR WINAPI PathGetExtension?( LPCSTR lpszPath);

LPSTR WINAPI PathGetArgs?( LPCSTR lpszPath);

int WINAPI PathGetDriveNumber?( LPCSTR lpszPath);

BOOL WINAPI PathRemoveFileSpec?( LPSTR lpszPath);

Figure 2 Component Extraction Functions

PathGetArgs? (ordinal 52) searches for any command line arguments in the specified path. The arguments are considered to be anything following the first unquoted space. If there are no arguments the function returns a blank string. PathGetDriveNumber? (ordinal 57) returns the drive number of the specified path (0 for drive A, 1 for drive B, etc.). If there is no drive letter the function returns -1. PathRemoveFileSpec? (ordinal 35) removes the filename from the end of a path along with the trailing backslash (unless it's a root path). It returns TRUE if something was removed, or FALSE if there was no filename.

  void WINAPI PathStripPath?(
    LPWSTR lpszPath);

BOOL WINAPI PathStripToRoot?( LPWSTR lpszPath);

void WINAPI PathRemoveArgs?( LPWSTR lpszPath);

void WINAPI PathRemoveExtension?( LPWSTR lpszPath);

Figure 3 NT Component Extraction Functions

On Windows NT there are a couple more of these functions (see Figure 3). PathStripPath? (ordinal 38) removes the path from the beginning of a filename. PathStripToRoot? (ordinal 50) strips characters from the end of a path until only a root drive, directory, or UNC network share remains. If the path is relative (i.e. it has no root), the function returns FALSE. PathRemoveArgs? (ordinal 251) removes any command line arguments from the end of the specified path. PathRemoveExtension? (ordinal 250) removes the file extension from the end of the specified filename. More Path Manipulations Now for a couple more functions that can be useful when manipulating paths (see Figure 4). First off we have PathGetShortPath? (ordinal 92), but it doesn't really provide any more functionality than the documented function GetShortPathName?. The only advantage is that instead of having to provide a buffer for the result, the long pathname is overwritten with the converted short form of the name. If the operation is successful, the function returns a pointer to the path - if it fails, the return value is NULL.

For neatening up paths, we have PathRemoveBlanks? (ordinal 33), which removes any leading and trailing spaces from the specified path; PathQuoteSpaces? (ordinal 55), which encloses the specified path in double quotes if it contains any spaces; and PathUnquoteSpaces? (ordinal 56), which removes any enclosing double quotes from a path.

The last function in this section, PathParseIconLocation?

  LPSTR WINAPI PathGetShortPath?(
    LPSTR lpszPath);

void WINAPI PathRemoveBlanks?( LPSTR lpszPath);

void WINAPI PathQuoteSpaces?( LPSTR lpszPath);

void WINAPI PathUnquoteSpaces?( LPSTR lpszPath);

int WINAPI PathParseIconLocation?( LPWSTR lpszPath);

Figure 4 Path Manipulation Functions

is only available on Windows NT (the ordinal number is 249). It simply takes a string specifying an icon location (of the form 'PathName?,IconIndex?') and separates the pathname from the icon index. The pathname is copied back over the input string, and the return value contains the icon index. If the specified string didn't contain a comma, the icon index that is returned is zero.

Path Testing

The next couple of functions (see Figure 5) allow you to perform various tests on paths. PathIsUNC (ordinal 39) basically just checks whether the path starts with two backslashes (i.e. it is a UNC pathname). PathIsRelative? (ordinal 40) makes sure the path doesn't start with a backslash or a drive specification (i.e. it's a relative path). PathIsRoot? (ordinal 29) returns TRUE if the specified path is nothing more than a root path specification (i.e. one of the following formats: X:\, \, or \\SERVER\SHARE).

PathIsExe? (ordinal 43) returns TRUE if the specified path has one of the following file extensions: BAT, PID, EXE or COM. The NT implementation also considers CMD to be an executable extension. The PathIsDirectory? function (ordinal 159) determines whether the path is a directory (i.e. the FILE_ATTRIBUTE_DIRECTORY attribute is set). PathFileExists? (ordinal 45) just checks whether the path exists.

  BOOL WINAPI PathIsUNC(
    LPCSTR lpszPath);

BOOL WINAPI PathIsRelative?( LPCSTR lpszPath);

BOOL WINAPI PathIsRoot?( LPCSTR lpszPath);

BOOL WINAPI PathIsExe?( LPCSTR lpszPath);

BOOL WINAPI PathIsDirectory?( LPCSTR lpszPath);

BOOL WINAPI PathFileExists?( LPCSTR lpszPath);

BOOL WINAPI PathMatchSpec?( LPCSTR lpszPath, LPCSTR lpszSpec);

BOOL WINAPI PathIsSameRoot?( LPCWSTR lpszPath1, LPCWSTR lpszPath2);

Figure 5 Testing Functions

PathMatchSpec? (ordinal 46) determines whether the path in the lpszPath parameter matches the search specification in the lpszSpec parameter. The search specification supports both the question mark and asterisk wildcards and there can be multiple occurences of each wildcard. For backwards compatibility, the *.* specification matches all paths regardless of whether they contain a dot or not. Mutiple alternate search specifications can be supported by separating each specification with a semicolon.

The last function, PathIsSameRoot? (ordinal 650), is another function that is only supported on Windows NT. It checks whether the two specified paths share the same root (i.e. they're on the same drive or the same UNC network share).

Creating Something Unique

There are two functions for creating unique pathnames with more or less the same parameters (see Figure 6). The lpszPathName parameter specifies the path to the folder where the new file will be created. The lpszShortName parameter specifies the preferred filename to use if the destination drive doesn't support long filenames. lpszLongName specifies the preferred long filename - if it is NULL the short filename will always be used. If lpszShortName is NULL, the name is extracted from the end of the lpszPathName parameter (this only applies to PathYetAnotherMakeUniqueName though).

With both functions, a decimal number from 1 to 999 is added to the name in order to make it unique (actually with PathYetAnotherMakeUniqueName, when using the long filename, the first number tried is 2 rather than 1). When the short filename is used, any trailing digits are removed from the filename before adding the number. When the long filename is used, the function looks for the first set of parentheses containing nothing but digits and inserts the number there. If the name contains no parentheses, it will add some to the end of the name.

  BOOL WINAPI PathMakeUniqueName?(
    LPSTR lpszBuffer,
    DWORD dwBuffSize,
    LPCSTR lpszShortName,
    LPCSTR lpszLongName,
    LPCSTR lpszPathName);

BOOL WINAPI PathYetAnotherMakeUniqueName( LPSTR lpszBuffer, LPCSTR lpszPathName, LPCSTR lpszShortName, LPCSTR lpszLongName);

Figure 6 Functions for Creating Unique Paths

Note, however, that with PathYetAnotherMakeUniqueName a unique number won't always be added to the name. The function will first attempt to use the specified name as is, and will only try to add a unique number if that filename doesn't already exist. In the case of the long filename, if there is a set of parentheses in the name to mark the insertion position for the unique number, those parentheses will first be removed before checking the name (the parentheses must be empty though - they can't contain any digits).

Once a unique filename has been found, the resulting name is placed in the buffer pointed to by lpszBuffer and the function returns TRUE. With PathMakeUniqueName?, though, only the filename is returned in the lpszBuffer parameter - PathYetAnotherMakeUniqueName returns the full path. If a unique name can not be found, the function returns FALSE. The ordinal number for PathMakeUniqueName? is 47; PathYetAnotherMakeUniqueName has an ordinal of 75.

Cleaning and Resolving Paths

This next group of functions is fairly complicated, so the explanations are going to be quite lengthy (see Figure 7 for the function specifications). The simplest one is probably PathFindOnPath? (ordinal 145). It will search for the specified file in each of the directories included in the alpszPaths array (the array is terminated with a NULL pointer). If the file can't be found there, it will also check the windows directory, the system directory (but not system32), and every path in the PATH environment variable. If the file is found, the lpszFile parameter is replaced with the full path to the file and the function returns TRUE. If the file can't be found, the function returns FALSE.

  BOOL WINAPI PathFindOnPath?(
    LPSTR lpszFile,
    LPCSTR *alpszPaths);

DWORD WINAPI PathCleanupSpec?( LPCSTR lpszPath, LPSTR lpszFile);

void WINAPI PathQualify?( LPSTR lpszPath);

BOOL WINAPI PathResolve?( LPSTR lpszPath, LPCSTR *alpszPaths, DWORD dwFlags);

int WINAPI PathProcessCommand?( LPCWSTR lpszPath, LPWSTR lpszBuff, DWORD dwBuffSize, DWORD dwFlags);

Figure 7 Cleaning and Resolving Functions

PathCleanupSpec? (ordinal 171) is also fairly straightforward. It takes the filename specified in lpszFile and replaces or removes any invalid characters. Spaces, forward slashes and fullstops (when invalid) are replaced with a dash; other invalid characters are removed. It uses the lpszPath parameter to determine whether the filename is intended for a filesystem that supports long filenames or not (so it knows which characters are valid). On a filesystem that only supports short filenames it will also truncate the name into an 8.3 format and convert all characters to uppercase. The return value shows what operations were performed on the file- name to clean it up (see Figure 8).

  PCS_REPLACEDCHARS 0x00000001
  PCS_REMOVEDCHARS 0x00000002
  PCS_SHORTENED 0x00000004
  PCS_PATHTOOLONG 0x80000008

Figure 8 PathCleanupSpec? Return Values

PathQualify? (ordinal 49) takes the lpszPath parameter and tries to turn it into a valid, fully qualified path. Forward slashes are changed to backslashes (well sometimes - there's a bug in the implementation); if there is no drive specification the windows drive will be added to the front; the '..' and '.' character sequences are compacted out of the path; and if it ends with a backslash and isn't a root path, the backslash will be removed. If it is on a drive that doesn't support long filenames, any invalid characters will be replaced with underscores, any part of the path that is longer than 8.3 will be truncated, and any part with more than one file extension will have all but the last removed. Finally, if the path ends with a dot the dot is also removed.

  PRF_CHECKEXISTANCE 0x01
  PRF_EXECUTABLE 0x02
  PRF_QUALIFYONPATH 0x04
  PRF_WINDOWS31 0x08

Figure 9 PathResolve? Flags

PathResolve? (ordinal 51) is similar to the PathQualify? function, although it is more oriented towards creating a path that actually exists. The first step in this process is to qualify the path (as with PathQualify?) unless the given string is a root directory, a root UNC sharename, or a simple filename with no path. When qualifying the path, if PRF_QUALIFYONPATH is specified (see Figure 9 for the list of flags), the first path from the alpszPaths array is used when adding a missing drive specifier rather than the windows drive. Also, if the path in lpszPath is missing a leading backslash, the full path from alpszPaths is added to the front rather than just the drive specifier.

At this stage, unless lpszPath is a simple filename (which we'll deal with later), it can be considered to be a fully qualified and valid pathname (although it may not actually exist). If the PRF_CHECKEXISTANCE flag is specified, though, the function will first check if the file exists and only return TRUE if it does. If the file can't be found and the filename has no extension, the function will also try appending various executable extensions when looking for a match (namely PIF, COM, EXE, BAT, LNK as well as CMD on Windows NT). If the PRF_WINDOWS31 flag is specified, only the first four will be checked though. In any event the lpszPath parameter is replaced with the updated path.

As for the case when lpszPath is a simple filename: the only way to obtain a fully qualified path is to actually search for the file regardless of whether PRF_CHECKEXISTANCE is specified. The function first looks for the file in each of the directories specified in the alpszPaths array - if that fails, it also checks the windows directory, the system directory (but not system32), and every path in the PATH environment variable. If the filename has not extension and either the PRF_CHECKEXISTANCE flag or the PRF_EXECUTABLE flag is specified, the function will also try appending the various executable extensions mentioned above. The function returns TRUE if it eventually finds the file and, of course, the lpszPath parameter will be replaced with the updated path.

  PPCF_QUOTEPATH 0x01
  PPCF_INCLUDEARGS 0x02
  PPCF_NODIRECTORIES 0x10
  PPCF_DONTRESOLVE 0x20
  PPCF_PATHISRELATIVE 0x40

Figure 10 PathProcessCommand? Flags

The last function in this group is PathProcessCommand? (ordinal 653), but it is only available on Windows NT. Its purpose is to take a command line (consisting of a path- name, possibly followed by a number of command line arguments), and clean it up in a number of ways. The first step is to resolve the path using PathResolve? with PRF_ CHECKEXISTANCE and PRF_EXECUTABLE. This step is only performed if the path is relative, or the PPCF_PATHISRELATIVE flag is specified (see Figure 10 for the list of flags). You can also force the resolve to be skipped by specifying PPCF_DONTRESOLVE, but this flag should not be specified if the orginal pathname isn't quoted since there is a bug in the implementation that corrupts the resulting path.

When the pathname isn't enclosed in quotes the function attempts to separate the path from the command line arguments by considering each space in the string as the end of the pathname and the checking whether the proceeding characters make up a valid filename (resolving the filename if necessary, as explained above). During this process, the function would usually also consider a directory name to be a valid filename, but you can force matching directory names to be ignored by specifying the PPCF_NODIRECTORIES flag.

Once the path has been extracted (and successfully resolved if necessary), all that remains is to decide what needs to be copied into the output buffer. If you want the path enclosed in quotes you should specify the PPCF_ QUOTEPATH flag (however, this flag only has any effect if the path contains spaces). Specifying PPCF_QUOTEPATH will also append the command line parameters to the result buffer, but if you haven't specified PPCF_QUOTEPATH you can force the command line parameters to be included by specifying PPCF_INCLUDEPARMS.

The lpszBuff parameter specifies the buffer where the result should be copied, and dwBuffSize specifies the size of the buffer (in characters). If the function is successful it returns the length of result (including the null terminator). If the function fails for any reason it will return -1. If the lpszBuff parameter is NULL the function will still return a valid length (assuming it is successful), which allows you to determine the exact size of the buffer that will be required.

One Last Function

  BOOL WINAPI PathSetDlgItemPath?(
    HWND hDlg,
    int nIDDlgItem,
    LPCSTR lpszPath);

Figure 11 PathSetDlgItemPath?

The last path function that I'm going to describe doesn't really fit into any of catagories discussed above. PathSetDlgItemPath? (ordinal 48) is useful if you need to display a filename in a dialog that won't necessarily have enough space to accomodate the full path (see Figure 11 for the function specification). hDlg is the handle of the dialog, nIDDlgItem is the ID of the dialog item where the filename will be displayed, and lpszPath is the actual filename. If there isn't enough space to display the path in full, some of the characters in the name will be replaced with ellipses. The function returns TRUE if the dialog was updated successfuly or FALSE if the function failed.

Windows NT Issues

As I've mentioned in previous articles, undocumented functions on Windows NT usually expect string parameters to be in Unicode whereas Windows 95 expects ANSI strings. So, anywhere you see an LPSTR or LPCSTR in these function specifications, it should actually be considered a LPWSTR or LPCWSTR when your application is running on NT. Also, as has been mentioned above, a few of the path functions are only available on NT. If your application must run on both Windows 95 and NT, you'll need to check which OS is being used and only make use of the NT-specific functions if they're available (i.e. you'll need to link to them with GetProcAddress?).


EditText of this page (last edited May 2, 2009) or FindPage with title or text search