  • mreid
    robotblake: am now...
  • robotblake
    Ah, well I figured out the main_summary issue I think, it's running the fix now
  • mreid
    cool, thanks for checking into it!
  • mreid
    robotblake: what was it?
  • robotblake
    Was diffing the columns wrong :\
  • jason
    robotblake: thanks for handling 👍
  • mreid
    robotblake: still not quite right... Error running query: HIVE_PARTITION_SCHEMA_MISMATCH: There is a mismatch between the table and partition schemas. The column 'e10s_multi_processes' in table 'telemetry.main_summary' is declared as type 'bigint', but partition 'submission_date_s3=20171116/sample_id=99' declared column 'e10s_cohort' as type 'string'.
  • robotblake
    :|
  • robotblake
    Wait, wtf
  • robotblake
    Did we change the type of that?
  • robotblake
    I currently handle adding new columns and new fields to structs
  • robotblake
    Gr
  • robotblake
    mreid: Do you have a link to all the affected queries?
  • robotblake
    I'm wondering if I should just change them to use Presto
  • robotblake
    Guessing it's all the Athena queries from the s&i dashboard
  • mreid
    robotblake: that e10s_cohort field was removed, sounds like that might be the cause?
  • robotblake
    That would do it, I don't think it supports removing columns at all, when did that happen?
  • frank
    robotblake: Athena doesn't support removing columns? but it supports adding them?
  • mreid
    schema evolution is a lie :-/
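A minimal sketch of the column diff discussed above, covering the removed-column case as well as added columns and type changes. The schemas here are hypothetical two- and three-column stand-ins for `telemetry.main_summary`, and `diff_columns` and `positional_mismatches` are illustrative names, not the actual fix script. The positional check assumes partition and table columns are paired by index, which is what the error message above suggests, not something confirmed in this conversation.

```python
from typing import Dict, List, Tuple

Column = Tuple[str, str]  # (column name, column type)


def diff_columns(table: List[Column], partition: List[Column]) -> Dict[str, list]:
    """Compare two schemas by column name."""
    table_types = dict(table)
    partition_types = dict(partition)
    return {
        # In the table but missing from the partition.
        "added": [name for name, _ in table if name not in partition_types],
        # In the partition but no longer in the table.
        "removed": [name for name, _ in partition if name not in table_types],
        # Present in both, but with a different type: (name, old type, new type).
        "retyped": [
            (name, partition_types[name], ctype)
            for name, ctype in table
            if name in partition_types and partition_types[name] != ctype
        ],
    }


def positional_mismatches(table: List[Column], partition: List[Column]) -> list:
    """Compare two schemas by position (assumed to mirror the error above)."""
    return [
        (index, t, p)
        for index, (t, p) in enumerate(zip(table, partition))
        if t[1] != p[1]
    ]


# Hypothetical schemas: the old partition still has e10s_cohort,
# the current table definition does not.
partition_schema = [
    ("client_id", "string"),
    ("e10s_cohort", "string"),
    ("e10s_multi_processes", "bigint"),
]
table_schema = [
    ("client_id", "string"),
    ("e10s_multi_processes", "bigint"),
]

report = diff_columns(table_schema, partition_schema)
shifted = positional_mismatches(table_schema, partition_schema)
```

Under these assumptions, `report["removed"]` lists `e10s_cohort`, and `shifted` pairs the table's `e10s_multi_processes` (bigint) with the partition's `e10s_cohort` (string) at the same index, the same pairing the error reports.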
  • frank
    maybe we need to accelerate our plans to stop using Athena
  • frank
    there are too many other good solutions out there
  • mreid
    yeah, maybe a spark sql data source for the short term?
  • frank
    how well can we get an autoscaling EMR Spark cluster working?
  • frank
    looks like it autoscales by looking at cloudwatch at 5 minute intervals
  • mreid
    5 minutes seems kinda slow
  • frank
    agreed, but nothing compared to our bootstrap -_-
  • mreid
    true