Anti-virus scan doesn't recognise NTFS directory symbolic links

Hello

I had some Windows directory symbolic links defined (or possibly directory junctions) in Windows 7. The anti-virus scan recursed into the folder. If I’d left it running it would never have finished since the folder it traversed has the symlinks.

To make the symbolic links, I used mklink /d or /j. You should ignore symbolic links for a full scan. There is an attribute that tells you whether a shell item is a symbolic link.
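As a rough illustration of that attribute check, here is a minimal Python sketch. It assumes the Windows reparse-point flag is what marks both `mklink /d` symlinks and `mklink /j` junctions; `is_link` is a hypothetical helper name, and this is not a claim about how CIS inspects folders.

```python
import os
import stat

# 0x400 is the documented value of FILE_ATTRIBUTE_REPARSE_POINT; used as a
# fallback in case the constant is missing from this Python build.
REPARSE_POINT = getattr(stat, "FILE_ATTRIBUTE_REPARSE_POINT", 0x400)


def is_link(path):
    """Return True if path is itself a symbolic link or a Windows reparse point."""
    st = os.lstat(path)  # lstat examines the link itself, not its target
    if stat.S_ISLNK(st.st_mode):
        return True
    # st_file_attributes only exists on Windows; elsewhere treat it as 0.
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & REPARSE_POINT)
```

Note that the reparse-point flag can also be set on things other than symlinks and junctions, so a real scanner might want a narrower test.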

CIS version 4.1.150349.920, 32-bit.

Regards

Praful

That would be true with a defective arrangement of links.

Does it matter ?

Surely if the A.V. would recurse indefinitely as it follows the links,
then would not the CMD.EXE DOS command “DIR /S” also recurse indefinitely,
in fact I doubt whether Windows would ever get out of the start-up phase.

Alan

Hi Alan

Thanks for replying.

Yes, it matters and no, the links are not defective.

If the AV behaviour matches DIR /S that doesn’t prove anything: both could be defective (or not).

Ask yourself this: would you ever want the AV to loop indefinitely? I assume the answer is “no”. Now ask yourself, “how would you avoid it doing that?”

Given that Comodo AV said a full scan was required, it should have done a full scan and not have been waylaid by symbolic links which can legitimately loop back on themselves.

As it happens, the “DIR /S” behaviour is defective or, at least, incomplete. That may be because symlinks, even though they’ve been around for about 10 years in NT, are not incorporated into all commands. The Linux equivalent of DIR is “ls”, which does let you treat symlinks differently. That may be because symlinks have been around a long time in *nix OSs.

The Windows (GUI) equivalent of “DIR /S” (roughly speaking) is to view the properties of a folder in Windows Explorer. When you do this, you get no infinite recursion because Explorer doesn’t traverse symlinks. It’s smart.

A command line example of a Windows program that recognises symlinks is ROBOCOPY. You can avoid the problem of infinite recursion by using the /XJ switch, which excludes junction points. It’s not surprising that ROBOCOPY has this feature because it looks like it’s based on rsync, an old *nix program.

I don’t understand your comment about Windows not starting up. If you mean Windows won’t start if you have symlinks defined that result in infinite recursion, then you’re wrong because I have that setup and Windows 7 starts fine. It’s unlikely that Windows traverses the whole directory structure on startup.

I’m not sure if you’re a programmer but this is programmatically trivial to solve. No programmer worth their salt would think it acceptable behaviour.

Praful

I was not saying the links are defective, but the arrangement seems defective.

I may have misunderstood you.

It is possible to move a sequence of folders from C:\Music\1\2\3\4 to D:\C\Music\1\2\3\4
and to use a reparse point so that it looks as if they are still present and accessible from C:
You could have more music at E:\D\a\b\c
Is it not possible to add a reparse point at D:\C\Music\1\2\3\4 to point at the music on E:
so that looking at the contents of C:\ it would appear to hold C:\Music\1\2\3\4\a\b\c.
I also envisage using a reparse point to result in the appearance of C:\Music\1\2\3\4\2\3\4
but if the reparse point placed at D:\C\Music\1\2\3\4 was to designate 3 folders back, i.e. D:\C\Music\1\2\3\4,
then the recursive view from C: would be
C:\Music\1\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4\2\3\4 etc.
That is the situation I understood you to have,
and it seemed to me that there is no reason to have a recursive arrangement.
If you have played the music which is in C:\Music\1\2\3, will it sound different from C:\Music\1\2\3\4\2\3 ?
That is why I thought it a defective arrangement - I see no merit.

When I first learnt about reparse points I thought about the things I could do with them.
I thought about putting a reparse point at the end of C:\1\2\3\4 which linked back to C:\1,
so that I had the never ending sequence 1\2\3\4\1\2\3\4\1\2\3\4
My experience with Win XP Home edition is that it starts up much slower if the external drive is switched on, and the LEDS indicate it is being read from before I am allowed to logon.
I decided that the Operating system likes to know everything that is on the disks before it lets me log in,
and if it had the opportunity to recurse through an infinite loop it would never end.

I chose not to experiment because I anticipated the need to dust off my Acronis Boot CD to restore my system.

Windows is more clever than I realised if it will recognise and cleanly exit from an infinite loop.
Windows normally surprises me in the other direction ! !

For myself I would expect an A.V. scan to travel through the reparse points and detect any viruses that may be present in the applications that have been repositioned.

I tested before I implemented, and was satisfied that :-
PerfectDisk would still defrag drive C:\ and not try to walk through the reparse point and also work on the next partition;
Acronis would exclude from its image of C:\ anything that was actually moved to the next partition.
Everything else I tested was fooled by the reparse point into thinking those moved things were still on C:\

I did notice two anomalies with Windows Explorer.
With Program Files selected in the left hand tree I could click select all and all the application folders on the right were selected, and a right click told me
Size 1.36 GB (1,462,392,410 bytes)
When I deselected the Hewlett Packard folder a right click on the remaining group of folders still gave me
Size 1.36 GB (1,462,392,410 bytes)
Windows Explorer obviously did not look inside the HP folder when measuring the size of the group,
but when I selected only the HP folder a right click showed it held
Size 186 MB (195,362,179 bytes) - so that did not appear as part of the total - not what I expected but I can see why they do it that way.
There is a strange difference in the Size on Disk of the HP folder.
Regardless of whether Program Files is in the left hand tree and I have selected the HP folder,
or alternatively clicked through the reparse point so the contents are on the right and all selected,
Size on Disc 149 MB (156,483,817 bytes)
But the real location on Partition D:\ I am told
Size on Disc 192 MB (201,392,128 bytes)
All three sets of properties show the same quantities of files, folders, and bytes holding data,
but the drive D:\ reality of Size on Disc is not seen by two of the measurements.
I suspect Windows Explorer gets a bit confused when it is peeking through a reparse point.

Now that is the sort of foolishness I expect from Windows ! !

Alan

I doubt if Windows scans whole drives (internal or external) when starting. Having written software to traverse drives, I know how time consuming it can be to walk a drive. My guess is that it checks for the presence of drives. I find that NAS drives and USB drives wake up when I come out of hibernation, which indicates that Windows is, at least, checking for their presence.

I didn’t follow your example of the defective arrangement of symlinks. However, a program is better if its developer assumes as little as possible about the user’s configuration. In this case it’s easy. The general algorithm would be:

fetch list of local drives
for each drive, recurse through its folders:
    if a folder is a symlink, ignore folder and move to next folder
    otherwise process folder (check for viruses in each file in folder)

And that’s about it!

Windows XP didn’t let you create symlinks to network drives. Windows 7 does. I think this was introduced in Vista. At work, I have a symlink to a 7TB drive. This would mean that Comodo AV starts checking that! Fortunately, at work, we use Trend AV, which doesn’t suffer from the Comodo infinite recursion problem.

Anyway, I’ve now uninstalled Comodo because it wasn’t letting me access my NAS drives at home even though they were in my safe network zone list. This was working but stopped after Windows upgraded my WiFi driver. It could be some incompatibility between them. That’s on my laptop running Win 7 32-bit. On my desktop, Win 7 64-bit, Comodo works fine although it has the AV bug.

For my laptop, I’ve reverted back to Win 7’s firewall and am trying out the MS AV program.

Praful

At start-up Windows does not read the contents of 60,000 files on the external drive,
BUT when the external drive is connected Windows takes an extra 20 seconds at start-up,
and the LED on that drive shows continuous access for 20 seconds,
and I assume that it has been looking at the information which shows what files and folders will be available when required. That is my hypothesis, but I welcome alternate explanations.

I am aware of various flavours of reparse points. I recognise Symlinks as being in that group.
I am aware that other members of this group have different characteristics.
I believe some will only redirect individual files,
and some will handle a complete folder plus all its sub-folder and file contents.
Some are restricted to the same partition,
and some can redirect to a different partition.
I use Folder Junctions and thought I knew how that works
I do not remember all the other names
and I do not remember the capabilities of Symlinks.

I could start with several music albums in a folder sequence
C:\MY MUSIC\ROCK\CRASH\■■■■\WALLOP
After WALLOP I can add a Folder Junction that would connect to \SCREECHING\MONKEYS
and then C:\MY MUSIC would seem to contain ROCK\CRASH\■■■■\WALLOP\SCREECHING\MONKEYS.
Alternatively I could add a Folder Junction that would connect to C:\MY MUSIC\ROCK
and then I would expect to find the album WALLOP twice, i.e. at what appears to be
C:\MY MUSIC\ROCK\CRASH\■■■■\WALLOP
and also
C:\MY MUSIC\ROCK\CRASH\■■■■\WALLOP\ROCK\CRASH\■■■■\WALLOP
In fact I would expect an indefinite sequence of albums as it repeatedly runs through the same Folder Junction, limited only by any O.S. restrictions on maximum path length.

That is the situation I envisaged by your complaint that the A.V. scan would never end because it was always looping back on itself.
I can understand that some people might like to play the album WALLOP,
but why would they want the choice of two or more identical WALLOPS ?
That seems to me to be illogical and caused me to say it is a defective arrangement of links.
I see no benefit to it.

I am willing to learn.
Please explain where I am going wrong. Why would a symlink want to point back at itself ?

I believe W7 and VISTA have symlinks for compatibility with old applications that hold their settings at
"%systemdrive%\Documents and Settings\%username%\Application Data"
and W7 etc puts at the expected position a symlink that points to the new path
"%systemdrive%\Users\%username%\Application Data"
In my view that does NOT involve infinite recursive re-entry, it will pass through the symlink only once.

If CIS were to honour your request “You should ignore symbolic links for a full scan”,
would not that involve ignoring any virus held by the older generation of applications ?

I certainly feel happy in the belief that when CIS scans C:
it not only looks at all the files under “C:\Program Files”,
but it also goes through the folder junction placed at "C:\Program Files" and scans what I have repositioned on Drive D:, e.g. an Open Office installation.

Regards
Alan

Hi Alan

I wasn’t questioning your knowledge of reparse points/symlinks, etc. I’m sure you understand them. I should have made clear that when I said symlinks I meant “directory symbolic links” as opposed to hard links or directory junctions. Collectively, below, I’ll call them “links”.

The exact example is not important although below I provide it. All Comodo have to do is avoid traversing links. If you start at the root of a drive and recursively traverse it (and avoid links), you will visit every folder in the tree. That includes old apps because the links are an alternative way of getting to them not the only way. To prove this, use this recursive algorithm (in pseudo-code) on a subset of your drive:


function recurse(top_folder)
  for each item in top_folder
    if item is link then
      continue with next item //just skip links: we'll get to them another way!
    else if item is folder then
      recurse(item)  //we're going a level down the folder tree here
    else //we have a file!
      scan file
    end if
  end for //this means we've finished with this item and we want the next one (if it exists)
end function
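
For anyone who wants to try the pseudo-code above on a real folder, here is a minimal Python sketch of the same idea. The names `scan_tree` and `scan_file` are hypothetical (the callback stands in for whatever the virus check would be), and the junction test relies on `DirEntry.is_junction`, which only exists in Python 3.12 and later; on older versions junctions would not be detected by this sketch.

```python
import os


def _is_link(entry):
    """True for symlinks and, where Python can tell (3.12+), directory junctions."""
    if entry.is_symlink():
        return True
    is_junction = getattr(entry, "is_junction", None)  # absent before Python 3.12
    return bool(is_junction and is_junction())


def scan_tree(top_folder, scan_file):
    """Visit every item under top_folder, skipping links, calling scan_file on files."""
    with os.scandir(top_folder) as entries:
        for entry in entries:
            if _is_link(entry):
                continue  # just skip links: the target is reached another way
            if entry.is_dir(follow_symlinks=False):
                scan_tree(entry.path, scan_file)  # one level down the tree
            else:
                scan_file(entry.path)  # we have a file
```

Calling `scan_tree("c:\\data", print)` would simply print each file path it reaches; a circular link such as the one described below is never entered.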

Now for the arrangement of links that I have. With Windows 7, you can create a taskbar folder that behaves like QuickLaunch in XP. When I click on that folder the contents of that folder pop up. That folder is in c:\data\QuickLaunch. Typically in that folder are a load of program shortcuts. These supplement the shortcuts I’ve placed on the Windows 7 taskbar. As I created this QuickLaunch folder, I thought wouldn’t it be cool if I could have a shortcut to a folder in it. Then when I click on QuickLaunch and place my mouse over the shortcut to the folder, the contents of that folder would appear - and similarly with folders within that folder shortcut. To do this, the shortcut couldn’t be a standard Windows shortcut (it wouldn’t expand when the mouse was over it). However, it can be a symlink (created using mklink /d). That means I can put any number of directory symlinks in c:\data\QuickLaunch and when I mouse over them in the taskbar, they would expand. I could get to any file anywhere with one click (well two: one to expand the QuickLaunch folder from the taskbar and the other to open the file/program). Now one of the symlinks I have in c:\data\QuickLaunch is to c:\data because in c:\data I have a load of other folders and docs (I treat it like My Documents except I had it before Microsoft had even thought of My Documents - a long time ago!). So now you see the circularity.

I could move QuickLaunch out of c:\data e.g. to c:\QuickLaunch. That solves the circularity problem but it’s still inefficient. In QuickLaunch, I also have a symlink to c:\apps. Now there is no circularity there but it means that Comodo AV scans c:\apps twice: once naturally (when traversing c:) and then again when traversing c:\data\QuickLaunch. Not fatal but wasteful and therefore inefficient and therefore to be avoided if you like writing smart programs!

This may sound convoluted or not depending on your experience. There may even be other, possibly better, ways of achieving the functionality I desired. But from the programmer’s perspective it’s irrelevant because the right algorithm sidesteps this pitfall.

As it happens, I installed Microsoft’s AV program and it worked!

Praful

Thank you for your patience and explanations.

After your explanation I now appreciate that Comodo will scan every file on the drive C:\ exactly once if it avoids traversing links to other regions on the drive C:. I have only used links to access applications that were installed on “C:\Program Files” and have since been moved to a different partition and accessed via a link, and in this case the only way to scan them from C:\ is via the links. I now appreciate that in your case scanning through the links would duplicate the scanning of those files and be a waste of time.
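
Both cases described above (links that loop back, and moved applications reachable only through a link) could in principle be handled by following links while remembering which real folders have already been visited. The following is a minimal Python sketch of that idea, not a description of how Comodo or any other scanner works; `scan_once` and the `scan_file` callback are hypothetical names, and it assumes `os.path.realpath` resolves both symlinks and junctions (true on Windows from Python 3.8).

```python
import os


def scan_once(top_folder, scan_file):
    """Follow links, but enter each real folder only once."""
    seen = set()          # real folders already visited
    stack = [top_folder]  # folders still to visit
    while stack:
        # Resolve links so a folder reached by two routes has a single identity.
        real = os.path.realpath(stack.pop())
        key = os.path.normcase(real)  # Windows paths are case-insensitive
        if key in seen:
            continue  # circular or duplicate route: already scanned
        seen.add(key)
        with os.scandir(real) as entries:
            for entry in entries:
                if entry.is_dir():       # follows links to folders
                    stack.append(entry.path)
                elif entry.is_file():    # broken links are neither and are skipped
                    scan_file(entry.path)
```

Because each folder is listed by its resolved path, paths never grow without limit, a folder linked from several places is scanned once, and a folder that lives on another partition is still reached through its link. Files that are themselves linked from several places could still be scanned more than once in this sketch.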

My head hurts ! ! !

I concede, you have valid reasons for links which allow circularity which DIR /S may see as an infinite series,
and unfortunately Comodo also sees them as an infinite series, and that would be avoided if Comodo saw the world like Windows Explorer instead of like DIR /S.

I never thought Windows could be that clever,
especially after my experience before Win XP when every morning Windows would scold me for not shutting down properly, and every evening it would just sit there sulking when I told it to shutdown, and I had to pull the mains plug before the janitor locked up for the night ! !

Regards
Alan