www.comodo.com down???

Not a good day. It has been down frequently whenever I ran periodic checks.

It was down for me too, from 21:45 to 23:12. I am at GMT+1.

kail,
When you gave the Wireshark log, what URL were you hitting?
Please do mail me your logs. All help gratefully received.

Our monitoring shows 100% availability for forums.comodo.com, so either the monitoring system is part of the conspiracy against us (and I’m not ruling it out) or there is some other variable factor here that I’m still not getting.

Blas, sded, what URLs were you hitting? What results, error codes, etc. did you get?

Robin

I am hitting http and https forums.comodo.com exclusively. https://forums.comodo.com/comodo_website_issues_for_submitting_website_problems_only-b46.0/ and help for v3, bug reports, unread search. No real error codes, just hangs and eventually times out usually. Have you checked your logs for a DOS attack? The symptoms are similar to others I have seen recently (Broadband Reports had symptoms like this for several days last month and reported it was a DOS attack from the RBN. Also, Wilders has had occasional problems.)

Hi Robin

The 503s were on CFP’s download page and the forums. I’ve sent the PCAPs, and all the details are in those, I believe. Hope it helps.

sded: You probably will not see any errors (see above posts).

Agree; just see timeouts for the most part. Sure looks like the DOS attacks I mentioned in the previous message.

To be honest, it’s not my area of expertise. Oh dear*… given the symptoms, it’s probably a single server (single point of failure) that performs some small, low-key action (like the trustlogo certs or something) and is throwing a hissy fit about something. DOS? I doubt it. I think a DOS attack would have much more impact than what we’re seeing… previous ones I’ve observed, both as a user and as a BB admin, have had massive impacts. It took 10 minutes to get from the user prompt to the password prompt! My guess is that it is either an intermittent routing problem (config or database) or an unmaintained database somewhere, gently moaning to itself. ;D

*Sorry, I just can’t help myself, I just gotta speculate.

The last three or four days it’s ok here from Sweden (at least for me).

Not my area of expertise either, just similar symptoms at BBR previously. RBN attacked gateways and brought down parts of the interface; BBR responded with spare servers that used different URLs and throttled the regular gateways. RBN countered, of course, then … The referenced thread is interesting reading, though. And Robin says the monitoring doesn’t show anything. Just more wild speculation by the uninformed. :slight_smile:

Pretty sure it’s a governmental conspiracy…but can’t say for certain. (:WAV)

Lusher? Botnet from a disgruntled banned user or angry competitor? The RBN? Or maybe just a network configuration and monitoring problem? :THNK

sded,
I agree that a network config problem would give the effect we see.
I am using two separate external 3rd party monitoring services to try to make sure they aren’t part of the (joking) conspiracy.
I’ve just set this one https://secure1.securityspace.com/netmon/report.html?graphID=28122 up to publicly monitor https://forums.comodo.com/comodo_website_issues_for_submitting_website_problems_only-b46.0/ in particular from several locations.
The logging on our routers and webservers would show a DOS, and we don’t see it.

kail, the 503 errors you see interest us. We have a couple of theories we are checking.
The SSLv2 failure you see doesn’t surprise me. Our servers should reject an SSLv2 session as it has a known vulnerability.

Robin

I receive normal timeout errors: “the server is not responding”.
There is no error code.

kail,
thanks again for your pcap traces.
I think we are in a position to be able to explain why you are seeing failures. I suspect that elements of this explanation hold for other people seeing problems too.

Working through your traces one at a time:
http_dl_failed:
Lines 1&2 are the DNS query, returning the correct IP for www.personalfirewall.com of 91.199.212.132.
The session then goes on to pull various elements from the webpage (although you haven’t captured the “get” and the reply for the page itself).
Line 74 is where it starts to look strange, because you suddenly hit a new (to this session) IP address of 85.91.228.132. No DNS query has returned this IP address in this session log.
Line 81 sees you doing an HTTP GET of download.comodo.com/download/setups/file_details.js from 85.91.228.132.
Line 121 sees (50 seconds later) the answer from that GET come back with a 503 - service not available.

The problem there is that the IP of download.comodo.com is really 91.199.212.132 (same as www.personalfirewall.com). You are picking up a cached or otherwise out-of-date DNS entry for download.comodo.com. Also, the fact that you get an HTTP 503 error back suggests to us that you are hitting the internet through a transparent proxy (presumably run by your ISP). You hit the proxy for file_details.js, the proxy tries to hit 85.91.228.132 and gets nothing back because those servers have gone. The proxy returns (we think) the 503 error to you.
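For anyone who wants to check which address range a lookup has landed in, here is a minimal Python sketch. It assumes the two ranges seen in these traces can be treated as /24 blocks, with 85.91.228.* being the out-of-date one and 91.199.212.* the current one. The name `classify_address` is made up for illustration.

```python
import ipaddress

# Assumption: the "85.91.228.*" and "91.199.212.*" ranges quoted in this
# thread are treated as /24 networks.
RETIRED_BLOCK = ipaddress.ip_network("85.91.228.0/24")
CURRENT_BLOCK = ipaddress.ip_network("91.199.212.0/24")


def classify_address(ip_text):
    """Return 'current', 'retired' or 'unknown' for a dotted-quad address."""
    address = ipaddress.ip_address(ip_text)
    if address in CURRENT_BLOCK:
        return "current"
    if address in RETIRED_BLOCK:
        return "retired"
    return "unknown"


if __name__ == "__main__":
    # Addresses taken from the traces discussed in this thread.
    for ip in ("91.199.212.132", "85.91.228.132"):
        print(ip, classify_address(ip))
```

An address reported as "retired" would point to a cached or otherwise stale DNS answer somewhere between the client and our DNS servers.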

https_forums_failed:
This is just talking to the wrong IP address from the start. It is talking with 85.91.228.149, whereas those servers (for forums.comodo.com) are now on 91.199.212.149.

https_forums_failed2:
This is interesting because it includes the colloquy between you and your DNS server.
You ask the DNS server at 172.31.140.69 to resolve forums.comodo.com.
You get the answer back that it resolves to 85.91.228.149. Then you try to start an SSL session with 85.91.228.149 but you’re sunk because again the IP address should have been 91.199.212.149.

https_forums_failed3:
ditto

http_forums_failed:
Here your DNS gives the correct IP address for forums.comodo.com (91.199.212.149), but for some reason the connection timed out and the transparent proxy returned a 503 (after 50 seconds).
This one would be worth more research if it were the predominant failure mode.
You can see the monitor at SecuritySpace trying exactly this and succeeding every 5 minutes from 5 separate monitoring locations.
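To tell a proxy-generated 503 apart from a plain timeout on your own machine, something like the following Python sketch could be run by hand. It is not how the SecuritySpace monitor works; the function name `probe`, the 10-second timeout and the result strings are all assumptions chosen for illustration.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch url once and describe the outcome as a short string."""
    try:
        with opener(url, timeout=timeout) as response:
            return "ok %d" % response.status
    except urllib.error.HTTPError as exc:
        # The server (or a proxy in between) answered with an error status.
        return "http %d" % exc.code
    except urllib.error.URLError as exc:
        # No HTTP answer at all: DNS failure, refused connection, timeout...
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "error: %s" % exc.reason
    except socket.timeout:
        # The connection opened but the read timed out.
        return "timeout"


if __name__ == "__main__":
    print(probe("http://forums.comodo.com/"))
```

A result of "http 503" means something answered on the server's behalf, whereas "timeout" means nothing answered within the limit.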

The 85.91.228.* IP addresses are an IP block which our servers were active on 2 or 3 weeks ago.
The 91.199.212.* block is the current one.
Our DNS servers are correctly configured to serve the newer address range.

I think you will see the apparent availability to you of the forums improve when your ISP kick their DNS servers (or maybe the proxy servers).
The fact that you don’t get the problem all the time suggests to us that there may be several proxy or DNS servers, and only one of them has its DNS “stuck” at some point in the past.
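One rough way to see whether answers vary between lookups is to resolve the name several times and count the results, as in this Python sketch. The name `sample_lookups` is hypothetical, and the approach assumes the operating system does not simply serve every repeat from its own cache, which it may well do.

```python
import socket
from collections import Counter


def sample_lookups(hostname, attempts=10, resolve=socket.gethostbyname):
    """Resolve hostname repeatedly and count how often each address appears."""
    counts = Counter()
    for _ in range(attempts):
        try:
            counts[resolve(hostname)] += 1
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            counts["lookup failed"] += 1
    return counts


if __name__ == "__main__":
    for address, seen in sample_lookups("forums.comodo.com").items():
        print(address, seen)
```

A mix of 85.91.228.* and 91.199.212.* answers would fit the theory that only some of the upstream servers are stale.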

Robin

Blas,
Which domain are you getting the timeouts from? Whichever it is, please can you send me the output of “ping forums.comodo.com” and “tracert forums.comodo.com” (assuming that you are seeing the failures with forums.comodo.com).

I don’t expect the ping to succeed and I don’t expect the tracert to show you every step in the route, but the information it gives could still be useful.

Thanks.

Hi Robin

Yes, as you know from my email, I realised my DNS wasn’t exactly behaving itself. I’ve passed all this on to my Provider (Hutch 3G, UK). So, thanks for the analysis! :-TU

I’ll switch my primary DNS server to one of the L3 DNS servers (4.2.2.1) and flush the DNS cache and see if that helps.
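Before changing any settings, it is possible to ask one particular DNS server directly and compare its answer with what the ISP's server gives. Below is a minimal Python sketch using only the standard library. It sends a single UDP query for an A record and does not check the reply ID or retry, so treat it as an illustration rather than a proper resolver; all function names are invented for this example.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def _skip_name(packet, offset):
    """Return the offset just past the (possibly compressed) name at offset."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer, always two bytes
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Extract the IPv4 addresses from the answer section of a DNS reply."""
    header = struct.unpack("!HHHHHH", packet[:12])
    questions, answers = header[2], header[3]
    offset = 12
    for _ in range(questions):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(answers):
        offset = _skip_name(packet, offset)
        rtype, rclass, ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_server(hostname, server, timeout=5):
    """Ask one specific DNS server (UDP port 53) for hostname's A records."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server, 53))
        reply, _ = sock.recvfrom(512)
    return parse_a_records(reply)
```

For example, `query_server("forums.comodo.com", "4.2.2.1")` could be compared with the same call against the ISP's DNS server address.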

You were right Robin,

The ISP’s DNS server still tries to direct me to the old IP. It is still strange that loading forums.comodo.com sometimes succeeds and other times does not.
The ping test failed; the tracert results are as follows:

Tracing route to forums.comodo.com [91.199.212.149]
over a maximum of 30 hops:

1 1 ms 1 ms 1 ms 192.168.1.99
2 9 ms 8 ms 10 ms portonovo.adsl.interware.hu [195.70.32.11]
3 9 ms 10 ms 9 ms vlan904.core3.interware.hu [217.20.137.37]
4 9 ms 9 ms 9 ms vlan906.core0.interware.hu [217.20.137.49]
5 9 ms 9 ms 9 ms GE-0-0-12.border0.interware.hu [195.70.32.4]
6 9 ms 9 ms 9 ms Gi8-0-0-208.bud-001-access-100.interoute.net [84.233.170.45]
7 28 ms 26 ms 26 ms Gi5-0-0.prg-001-access-300.interoute.net [212.23.50.113]
8 25 ms 25 ms 25 ms Gi3-0.prg-001-access-100.interoute.net [84.233.138.197]
9 26 ms 25 ms 25 ms Gi4-0.fra-006-core-2.interoute.net [212.23.50.110]
10 25 ms 24 ms 25 ms Gi6-0.fra-012-inter-1.interoute.net [212.23.42.166]
11 28 ms 27 ms 27 ms ge-0.de-cix.frnkge03.de.bb.gin.ntt.net [80.81.192.46]
12 28 ms 28 ms 27 ms xe-1-0-0.r20.frnkge03.de.bb.gin.ntt.net [129.250.2.148]
13 44 ms 44 ms 43 ms as-0.r22.londen03.uk.bb.gin.ntt.net [129.250.4.16]
14 43 ms 43 ms 42 ms xe-4-4.r01.londen03.uk.bb.gin.ntt.net [129.250.2.66]
15 63 ms 49 ms 59 ms 83.231.181.222
16 47 ms 47 ms 46 ms ge-0-2-0-0.rembrandt.as34270.net [85.91.232.6]
17 58 ms 69 ms 69 ms ge-0-0-0-315.davinci.as34270.net [85.91.224.26]
18 71 ms 75 ms 49 ms no-dns-yet.inetc.co.uk [85.91.232.14]
19 * * * Request timed out.
20 * * * Request timed out.
21 * * * Request timed out.
22 * * * Request timed out.
23 * * * Request timed out.
24 * * * Request timed out.
25 * * * Request timed out.
26 * * * Request timed out.
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.

Trace complete.

My OS is Hungarian, so the original output was in Hungarian; the messages above are translated into English.
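For long traces like the one above, a small Python sketch can pick out the last hop that actually replied. It relies only on the hop number and the "* * *" pattern, so it should not matter which language the status messages are in. The name `last_responding_hop` is made up for this example, and the parsing is a guess at the usual tracert layout rather than a complete parser.

```python
import re

# A hop line starts with the hop number; a fully timed-out hop shows "* * *".
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def last_responding_hop(trace_text):
    """Return (hop_number, address) of the last hop that sent any reply."""
    last = None
    for line in trace_text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header, footer or blank line
        hop, rest = int(match.group(1)), match.group(2)
        if rest.lstrip().startswith("* * *"):
            continue  # no reply at all from this hop
        addresses = IPV4.findall(rest)
        if addresses:
            last = (hop, addresses[-1])
    return last
```

Run against the trace above, the interesting part is where the replies stop, since every hop after that point timed out.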

Looks like the server reconfiguration needed a longer “make before break” period for the new DNS assignments to filter out. Maybe an announcement for the next reconfiguration now that we know it will break a number of us? :slight_smile: Thanks; Ed.

So how does the latest total loss of the Comodo online capability relate to the earlier problems, if at all? Was out for ~12 hours, with mostly nothing but finally a late referral to another forum prototype and a BS message to inform the users of nothing. And came up finally with no announcement of anything. Can you post something here on the outage?