[arin-tech-discuss] FW: [arin-ppml] Just so it is recorded here (DNSSEC.. ) outages today..

Nate Davis ndavis at arin.net
Wed Mar 9 13:20:48 EST 2016


On 3/9/16, 11:34 AM, "Christopher Morrow" <christopher.morrow at gmail.com>
wrote:

>Thanks!
>(I have a few questions, which may not be answerable here, I suppose..
>if they can be answered that'd be cool though)
>
>On Tue, Mar 8, 2016 at 12:59 PM, Nate Davis <ndavis at arin.net> wrote:
>>
>> ARIN's DNS process moves DNS data from the internal database to a
>> Secure64 DNSSEC appliance to a hidden distribution master. From the
>> hidden distribution master, zones are fetched to name server
>> constellations from ARIN, VeriSign, and PCH.
>>
>> About two weeks ago a script was run that reset the serial on a zone in
>> the database. This script was run to accommodate an inter-RIR network
>
>This script sounds like something that should/would happen
>periodically? (whenever there's an xfer I guess?) is that correct?
>
>> transfer, and is not executed during the normal course of operations. It
>> reset the serial in our database in an unexpected way, and consequently
>> zone transfers from the Secure64 to our distribution master did not
>> occur.
>>
>
>'unexpected way' meaning it decreased the serial? made it a string
>instead of an integer? something else?
>(ie: Can I dork up my zone by setting the serial in the same fashion?
>what should I look for?)
>
>> This script was cumbersome and error prone, and had already been
>> identified to be replaced in the upcoming, planned deployment this
>> weekend.
>>
>
>neat, ok.
>
>> This incident exposed a gap in our monitoring that we are fixing. Our
>
>is/was the gap: "Make sure the serial is monotonically increasing"
>or is/was it: "If you are going to back up the serial, be sure to force
>a reload on all masters via process X"
>
>(ie: If I make a serial change, what other things should I look for?
>what monitoring gap do I also have?)
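
On the "monotonically increasing" question: a secondary normally pulls a zone only when the primary's SOA serial is newer than the one it already holds, and "newer" is defined by RFC 1982 serial number arithmetic (32-bit, with wraparound). The sketch below only illustrates that comparison. The function name `serial_newer` is made up for this example, and the thread does not say what the script actually did to the serial, so treat the "reset to a small number" case as an assumption.

```shell
# serial_newer A B
# Succeeds (exit 0) when serial A counts as newer than serial B under
# RFC 1982 serial number arithmetic for 32-bit SOA serials.
# Hypothetical helper for illustration, not part of any DNS tool.
serial_newer() {
    a=$1
    b=$2
    # Forward distance from B to A, modulo 2^32.
    diff=$(( ($a - $b + 4294967296) % 4294967296 ))
    # "Newer" means a forward distance between 1 and 2^31 - 1.
    [ "$diff" -gt 0 ] && [ "$diff" -lt 2147483648 ]
}

# Example (assumed values): a serial reset from 2016030901 to 1 is NOT
# newer, so a secondary holding 2016030901 would not transfer the zone.
#   serial_newer 1 2016030901 || echo "secondaries will ignore this serial"
```

A pre-publish check along these lines could refuse to push any zone whose new serial is not newer than the serial currently being served.
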
>
>> current, legacy monitoring system does not adequately identify the
>> serial number inconsistencies between the DNS nodes, nor does it
>> adequately handle issues with DNSSEC signature validation. We have work
>> underway to replace our old monitoring system with a new system that
>> solves these problems.
>
>The legacy/current system should be doing the moral equivalent of:
>  # ask each listed name server for the zone's SOA record
>  for s in $(dig +short NS zone); do
>    dig +short SOA zone @"${s}"
>  done
>
>and make sure that all servers agree that the serial/soa is the same...
>right?
>Was there other verification that was happening? (or not)
>is the above too naive? should we be looking for other things?
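
The loop above prints the SOA records but leaves the comparison to a human. A minimal sketch of the "all servers agree" check follows, assuming the input is one `server serial` pair per line. The function name `serials_agree` and the `$zone` variable in the usage comment are hypothetical.

```shell
# serials_agree
# Reads "server serial" pairs on stdin and succeeds (exit 0) only when
# every server reported a serial and all reported serials are identical.
# Hypothetical helper for illustration.
serials_agree() {
    awk '
        NF < 2       { bad = 1; next }   # server returned no serial
        !seen[$2]++  { distinct++ }      # count distinct serial values
        END          { exit (bad || distinct != 1) }
    '
}

# Usage sketch (needs network access; $zone is an assumed variable):
#   for s in $(dig +short NS "$zone"); do
#       echo "$s $(dig +short SOA "$zone" @"$s" | awk '{print $3}')"
#   done | serials_agree || echo "serial mismatch for $zone"
```

Note that this only sees the public NS set. A hidden distribution master like the one described earlier in the thread would have to be queried by address separately, since it does not appear in the NS records.
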
>
>For DNSSEC I suppose you'd be doing the above but pulling the RRSIG for
>the SOA and making sure they are all the same.
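
Beyond checking that the signatures match, a monitor can also watch how long each signature has left. In RRSIG presentation format the signature expiration is the fifth RDATA field, written as YYYYMMDDHHMMSS in UTC. The sketch below converts that field to epoch seconds with plain arithmetic so it does not depend on a particular `date` implementation. The function names and the one-week threshold in the usage comment are assumptions for illustration, not anything ARIN has described.

```shell
# rrsig_expiry_epoch YYYYMMDDHHMMSS
# Prints the RRSIG expiration timestamp (UTC) as Unix epoch seconds.
# Hypothetical helper for illustration.
rrsig_expiry_epoch() {
    awk -v ts="$1" 'BEGIN {
        y = substr(ts, 1, 4) + 0;  m = substr(ts, 5, 2) + 0
        d = substr(ts, 7, 2) + 0;  H = substr(ts, 9, 2) + 0
        M = substr(ts, 11, 2) + 0; S = substr(ts, 13, 2) + 0
        # Treat January and February as months 13 and 14 of the prior year,
        # then count days since 1970-01-01 in the Gregorian calendar.
        if (m <= 2) { y--; m += 12 }
        days = 365 * y + int(y / 4) - int(y / 100) + int(y / 400) \
             + int((153 * (m - 3) + 2) / 5) + d - 719469
        printf "%.0f\n", days * 86400 + H * 3600 + M * 60 + S
    }'
}

# rrsig_seconds_left YYYYMMDDHHMMSS NOW_EPOCH
# Prints how many seconds remain before the signature expires
# (negative once it has already expired).
rrsig_seconds_left() {
    echo $(( $(rrsig_expiry_epoch "$1") - $2 ))
}

# Usage sketch (needs network access; $zone and $server are assumed):
#   exp=$(dig +short RRSIG "$zone" @"$server" | awk '$1 == "SOA" {print $5}')
#   left=$(rrsig_seconds_left "$exp" "$(date +%s)")
#   [ "$left" -gt 604800 ] || echo "SOA signature expires within a week"
```
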
>
>> This update is being posted to both arin-ppml and arin-tech-discuss. To
>> avoid non-policy related discussion on PPML, we encourage follow up
>> discussion on arin-tech-discuss, a public mailing list that ARIN's
>> engineering team monitors. For those not familiar with
>> arin-tech-discuss, please subscribe here:
>> http://lists.arin.net/mailman/listinfo/arin-tech-discuss
>>
>
>oh :)
>
>> Regards,
>>
>> Nate Davis
>>
>>
>> On 3/8/16, 11:05 AM, "arin-ppml-bounces at arin.net on behalf of Chris
>> Woodfield" <arin-ppml-bounces at arin.net on behalf of chris at semihuman.com>
>> wrote:
>>
>>>Agreed with Chris' sentiment. I'm a firm believer in the blameless
>>>post-mortem particularly when paired with action items to avoid repeat
>>>occurrences, and I'd hope that others can learn from the technical
>>>issues involved.
>>>
>>>On top of that, everyone loves a good war story :)
>>>
>>>Thanks,
>>>
>>>-C
>>>
>>>> On Mar 8, 2016, at 7:45 AM, Christopher Morrow
>>>><christopher.morrow at gmail.com> wrote:
>>>>
>>>> Also, it'd be super awesome if there would be a pretty detailed
>>>> post-mortem document published about what happened, how it happened
>>>> and how it was discovered/repaired.
>>>>
>>>> I believe ARIN isn't the only one having these issues, so publishing
>>>> so other folk can learn would be great!
>>>>
>>>> -crhis
>>>>
>>>> On Mon, Mar 7, 2016 at 10:28 PM,  <frnkblk at iname.com> wrote:
>>>>> Nate,
>>>>>
>>>>> Please let us know if ARIN monitors all their zones for DNSSEC
>>>>> signature expiration.
>>>>>
>>>>> Frank
>>>>>
>>>>> -----Original Message-----
>>>>> From: arin-ppml-bounces at arin.net [mailto:arin-ppml-bounces at arin.net]
>>>>> On Behalf Of Nate Davis
>>>>> Sent: Monday, March 07, 2016 7:59 PM
>>>>> To: Michael Peddemors <michael at linuxmagic.com>; arin-ppml at arin.net
>>>>> Subject: Re: [arin-ppml] Just so it is recorded here (DNSSEC.. )
>>>>> outages today..
>>>>>
>>>>> Michael - thanks for reporting the issue.
>>>>>
>>>>> ARIN Engineering resolved the DNSSEC failure shortly after you
>>>>> reported the issue. They are currently looking into the cause of the
>>>>> failure. All DNSSEC functions should be operating properly at this
>>>>> time.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Nate Davis
>>>>> Chief Operating Officer
>>>>> American Registry for Internet Numbers
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 3/7/16, 6:14 PM, "arin-ppml-bounces at arin.net on behalf of Michael
>>>>> Peddemors" <arin-ppml-bounces at arin.net on behalf of
>>>>> michael at linuxmagic.com> wrote:
>>>>>
>>>>>> We had a flurry of reports from various customers, problems with
>>>>>> reverse DNS lookups..
>>>>>>
>>>>>> Limited to the 65/8 IPv4, and from apparent reports, related to a
>>>>>> failure to update a DNSSEC signature..
>>>>>>
>>>>>> Reported: Anyone with a DNSSEC enforced name server will have
>>>>>> problems with PTR queries for that range.
>>>>>>
>>>>>> Someone with more inside knowledge can provide more details, I am
>>>>>> sure..
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> "Catch the Magic of Linux..."
>>>>>> ------------------------------------------------------------------------
>>>>>> Michael Peddemors, President/CEO LinuxMagic Inc.
>>>>>> Visit us at http://www.linuxmagic.com @linuxmagic
>>>>>> ------------------------------------------------------------------------
>>>>>> A Wizard IT Company - For More Info http://www.wizard.ca
>>>>>> "LinuxMagic" a Registered TradeMark of Wizard Tower TechnoServices Ltd.
>>>>>> ------------------------------------------------------------------------
>>>>>> 604-682-0300 Beautiful British Columbia, Canada
>>>>>>
>>>>>> This email and any electronic data contained are confidential and
>>>>>> intended solely for the use of the individual or entity to which they
>>>>>> are addressed.
>>>>>> Please note that any views or opinions presented in this email are
>>>>>> solely those of the author and are not intended to represent those of
>>>>>> the company.
>>>>>>
>>>>>> _______________________________________________
>>>>>> PPML
>>>>>> You are receiving this message because you are subscribed to
>>>>>> the ARIN Public Policy Mailing List (ARIN-PPML at arin.net).
>>>>>> Unsubscribe or manage your mailing list subscription at:
>>>>>> http://lists.arin.net/mailman/listinfo/arin-ppml
>>>>>> Please contact info at arin.net if you experience any issues.
>>>>>
>>>>
>>>
>>


