
    Azure Outage... Again

    Scheduled Pinned Locked Moved IT Discussion
    microsoftazure
    79 Posts 13 Posters 24.6k Views
    • wirestyle22W
      wirestyle22 @scottalanmiller

      @scottalanmiller said in Azure Outage... Again:

      @aaronstuder said in Azure Outage... Again:

      @gjacobse That seems like an issue.

      Yes, that's why we think that their loss of subscription data is the core of the issue. Their VMs are dependent on the subscription data but they can't keep their subscription data working.

      How would they have configured this? Wouldn't any of their servers be clustered within multiple data centers? How does this happen with such a huge service?

      • scottalanmillerS
        scottalanmiller @wirestyle22

        @wirestyle22 said in Azure Outage... Again:

        @scottalanmiller said in Azure Outage... Again:

        @aaronstuder said in Azure Outage... Again:

        @gjacobse That seems like an issue.

        Yes, that's why we think that their loss of subscription data is the core of the issue. Their VMs are dependent on the subscription data but they can't keep their subscription data working.

        How would they have configured this? Wouldn't any of their servers be clustered within multiple data centers? How does this happen with such a huge service?

        They have several known issues in this system. My guess is that either another external system that manipulates this one feeds in bad data and causes outages that way, or the code of the system that interacts with it has bugs and causes issues that way. The former, I think, is far more likely based on a few factors, namely that account "type" often affects this. For example, because we are an MS Partner, there have been reports that some partner system has regularly connected to Azure's database and caused it to corrupt.

        No amount of clustering, multiple data centers or keeping servers up can fix this problem in the least. The problem is, from what we've been told, all from their workflows and security. Basically they have an unhealthy, non-working system that is given permission to control Azure and has been known to "randomly" cause Azure to totally fail.

        • scottalanmillerS
          scottalanmiller

          This is actually a really great example of how platform high availability is so much of a myth. The Azure physical platform can do some amazing HA, but it has incredibly fragile dependencies that make the HA features pointless. Who cares if the database is up and running if the data in it gets deleted by some automated process or by careless interns or whatever? Who cares if the application is running if the application itself fails? The high availability just makes people able to see the failed application; it doesn't keep anything working.

          Microsoft's problem here is that their product, Azure, itself is what is failing, not the physical infrastructure or the virtualization layer that it is running on. It's the actual cloud layer, not the hypervisor or physical layer, experiencing the problem. They've made their cloud layer overly complex and with dependencies that they are not keeping as reliable as other things.

          It shows that holistic risk understanding is very important and that the weakest link matters completely.
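A rough way to put numbers on the weakest-link point is the following minimal sketch. It assumes independent failures, and every availability figure in it is made up for illustration; none of them come from Microsoft or from this thread. The helper names `redundant` and `serial` are hypothetical.

```python
from math import prod


def redundant(availability: float, copies: int) -> float:
    """Availability of `copies` independent replicas where any one suffices."""
    return 1 - (1 - availability) ** copies


def serial(*availabilities: float) -> float:
    """Availability of a chain in which every dependency must be up."""
    return prod(availabilities)


# Hypothetical numbers, for illustration only.
physical = redundant(0.99, copies=3)  # clustered hosts across data centers
hypervisor = 0.9999                   # virtualization layer
control_plane = 0.99                  # fragile shared dependency (e.g. subscription data)

# The VMs only work if all three layers work, so the layers multiply.
stack = serial(physical, hypervisor, control_plane)

print(f"physical layer alone: {physical:.6f}")
print(f"whole stack:          {stack:.6f}")
```

Under these assumed figures, tripling the hosts pushes the physical layer very close to 1, yet the whole stack still ends up slightly worse than the control plane on its own. Adding more clustering underneath does not move that ceiling.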

          • A
            Alex Sage

            Are you sure the client isn't just forgetting to pay the bill?

            • scottalanmillerS
              scottalanmiller @Alex Sage

              @aaronstuder said in Azure Outage... Again:

              Are you sure the client isn't just forgetting to pay the bill?

              Not how it works. It's our account and we have partner credits, so even if we were not paying the bill our subscription would not go away. The VMs might turn off, I guess, but the account would not vanish. This is 100% a MS issue and it is a recurring one. There is no question where the issue is.

              • scottalanmillerS
                scottalanmiller

                We aren't wondering whether Azure is down; we know that it is. We know that the issue is Microsoft's and that it is the same issue that they have been having over and over again with many companies. More than 50% of the companies that we've interfaced with report that they have experienced this exact issue and have experienced MS denying it, even to our faces. What we are asking is how localized it is. Is it just one account (maybe an account manager deleted an account)? Is it regional? Is it people on a single database server or account category?

                • scottalanmillerS
                  scottalanmiller

                  MS support responded much more quickly than they stated that they were likely to do and... they need our subscription info to process the ticket.

                  AAARRGGHH

                  • Minion QueenM
                    Minion Queen

                    Well I responded right away when they responded to the ticket. Not that I can give them any information 😛

                    • scottalanmillerS
                      scottalanmiller @Minion Queen

                      @Minion-Queen said in Azure Outage... Again:

                      Well I responded right away when they responded to the ticket. Not that I can give them any information 😛

                      A lot of the issue, as we have found from having this happen a lot, is that support is based in India and has a script to follow and, of course, if a script could handle it, they would have already automated the fix. So even though this problem comes up constantly, they act like there is no solution and just freeze up. They have no process for dealing with these things, even when they are routine.

                      • wrx7mW
                        wrx7m

                        OK. So...

                        O365 with hosted Exchange - Good idea.
                        Azure - Bad idea.

                        • Minion QueenM
                          Minion Queen @wrx7m

                          @wrx7m said in Azure Outage... Again:

                          OK. So...

                          O365 with hosted Exchange - Good idea.
                          Azure - Bad idea.

                          YUP!

                          • gjacobseG
                            gjacobse @wrx7m

                            @wrx7m said in Azure Outage... Again:

                            OK. So...

                            O365 with hosted Exchange - Good idea.
                            Azure - Bad idea.

                            That is not to say that O365 doesn't have its trials and tribulations... but it has been the most stable.

                            • scottalanmillerS
                              scottalanmiller @wrx7m

                              @wrx7m said in Azure Outage... Again:

                              OK. So...

                              O365 with hosted Exchange - Good idea.
                              Azure - Bad idea.

                              Consistently, that is what we see.

                              • scottalanmillerS
                                scottalanmiller

                                The important part of that discussion is that when you need Exchange, O365 is a good idea. If you don't demand Exchange, there are great alternatives that cost the same, or less, and are more reliable. But they aren't Exchange.

                                As long as you need Exchange, you are beholden to MS support and capabilities to a significant degree either way.

                                • tonyshowoffT
                                  tonyshowoff

                                  Why are all of Microsoft's failures colossal ones, such as the whole IMAP situation recently? They seem to take an inordinate amount of time to fix as well.

                                  • scottalanmillerS
                                    scottalanmiller @tonyshowoff

                                    @tonyshowoff said in Azure Outage... Again:

                                    Why are all of Microsoft's failures colossal ones, such as the whole IMAP situation recently? They seem to take an inordinate amount of time to fix as well.

                                    Partially because they only make colossal systems.

                                    • tonyshowoffT
                                      tonyshowoff @scottalanmiller

                                      @scottalanmiller said in Azure Outage... Again:

                                      @tonyshowoff said in Azure Outage... Again:

                                      Why are all of Microsoft's failures colossal ones, such as the whole IMAP situation recently? They seem to take an inordinate amount of time to fix as well.

                                      Partially because they only make colossal systems.

                                      Naturally, but it doesn't seem as though even larger systems have as much in the way of outages. Not to brag, we aren't nearly as big, but we haven't gone down once since 2007. I bet I've jinxed myself.

                                      • DashrenderD
                                        Dashrender @tonyshowoff

                                        @tonyshowoff said in Azure Outage... Again:

                                        @scottalanmiller said in Azure Outage... Again:

                                        @tonyshowoff said in Azure Outage... Again:

                                        Why are all of Microsoft's failures colossal ones, such as the whole IMAP situation recently? They seem to take an inordinate amount of time to fix as well.

                                        Partially because they only make colossal systems.

                                        Naturally, but it doesn't seem as though even larger systems have as much in the way of outages. Not to brag, we aren't nearly as big, but we haven't gone down once since 2007. I bet I've jinxed myself.

                                        See, it's this type of issue that SMBs look at and ask: why in the hell would I want to move to the cloud? In 20+ years and around 300 servers' worth of support, I think I've lost 4. That is such a small chance of an issue that the expense of the cloud (albeit for a supposedly better quality setup), extra internet, etc. all seems barely worthwhile in the face of outages like these.

                                        I'm going to curse myself as well, but I've had Exchange for 5+ years now, and Domino for 12 years before that, and never a full-out failure of either hardware or software.

                                        I had tons of clients in the same boat.

                                        just shaking head - not sure what's real anymore 😛

                                        • scottalanmillerS
                                          scottalanmiller @tonyshowoff

                                          @tonyshowoff said in Azure Outage... Again:

                                          @scottalanmiller said in Azure Outage... Again:

                                          @tonyshowoff said in Azure Outage... Again:

                                          Why are all of Microsoft's failures colossal ones, such as the whole IMAP situation recently? They seem to take an inordinate amount of time to fix as well.

                                          Partially because they only make colossal systems.

                                          Naturally, but it doesn't seem as though even larger systems have as much in the way of outages. Not to brag, we aren't nearly as big, but we haven't gone down once since 2007. I bet I've jinxed myself.

                                          Oh granted, MS is not on top of these things like their competitors. They are "good" (mostly, not Azure, they couldn't cloud their way out of a plastic bag these days) but they are not "great." They clearly have zero capability to play with the Amazons and Googles of the world.

                                          Partly, and this is huge, MS is a software firm, not an IT one. Running large scale IT is as foreign to them as it is to any random large company. And they are hampered by needing to run a lot of it on Windows and Hyper-V, which are obviously not the best choices in many cases. Amazon and Google, for example, choose other platforms for a reason. Microsoft has an "eat their own dogfood" problem that limits their choices significantly and limits them to using technology that no other major player would even consider.

                                          • scottalanmillerS
                                            scottalanmiller @Dashrender

                                            @Dashrender

                                            I'm going to curse myself as well, but I've had Exchange for 5+ years now, and Domino for 12 years before that, and never a full-out failure of either hardware or software.

                                            This isn't normal, though. I've worked in many a shop with massive budgets and huge teams and their Exchange and similar systems failed far, far more often than Microsoft does.

                                            And you are talking Exchange; we are not. We are talking Azure. MS Hosted Exchange I've barely seen blip; it's way more stable than any on-premises / in-house system I've ever seen. It's Azure that they can't run to save themselves. If you can run a cloud the size of Azure in house without a failure, let me know.
