We enjoy sharing a best-practices framework for validating the health of your Snowflake implementation, and discussing best practices to help you make sure you're on the right track to unlock additional business value and technical improvement opportunities as you bring your data into Snowflake for a single source of truth.
Check out our Meetup video below to watch an overview of this event:
Other Meetups and Events to check out:
Transcript of the Meetup video:
Hello everyone. Welcome to another awesome Carolina Snowflake Meetup event. Today we will be going over Snowflake health checks. Super exciting topic here. These are our virtual meetup rules. Just that we’re all respectful and on mute and using the chat to ask questions.
That way we can keep everybody in the loop and answer some questions that other people might have as well. And we try to keep all of the questions on topic. Video is optional and we always hope that you’ll recommend or invite a friend and hit us up on Twitter if you want to reach out.
You can find our previous meetups on our YouTube channel, which is linked in the discussion board of our Meetup group, if you'd like to go back and check any of those out. We usually get the recordings up after a few days.
Today we are going to have a quick Snowflake refresher as usual, then we will discuss our Snowflake health checks and do a demo, talk a little bit about the symptoms that would prompt a health check, and go over our upcoming events.
Just a quick introduction for those that might be new to Snowflake. Always welcome, excited to have you and excited for your journey that you’re about to take or have maybe recently taken as well. And for those that are more seasoned veterans, you understand the power and the goodness with Snowflake.
Again, for those that are new to Snowflake or maybe considering moving to Snowflake: what is Snowflake? You hear about the data cloud. Think of it as the means to ingest your data, whether that's from another cloud source, on premise, a third-party flat file, or a combination of all of the above, and really centralizing that data, governing that data, controlling the access, but then also distributing that data. Perhaps there's an anonymized version of the data to distribute through a data share to an outside party as you look to potentially monetize your data, or just make it easier for your internal and external customers to consume your data.
And naturally, as your needs go up, whether for processing power, storage, or use cases ranging from data science to an end user, you can scale as you see fit. So there are lots of different potential use cases for the Snowflake data cloud. We talked about some of those different types of sources, whether that's transactional systems, enterprise applications, all the way through to IoT, and pulling that data in.
At times you may transform that data, you may normalize that data, you may aggregate that data, or a combination of all the above relative to one of those different consumers within your organization.
And sometimes those consumers are real humans, and other times they are actually applications that will be consuming the data as well. Anapa, would you like to introduce our Snowflake health checks to everyone?
So the idea behind a health check is to have some form of mechanism, a framework, so that given a Snowflake instance, you know what the health of the system is. If you had to log in to a Snowflake system, how would you know whether that Snowflake instance is healthy or not?
Now, having said that, more than any specific technology or tool, it's more of a framework and philosophy as to how you look at health checks. You could divide the entire concept of health checks into three major components.
You could have a dashboard where you monitor your Snowflake instance, and the dashboard is usually supplemented with some sort of alerting mechanism. And then, once you have the alerting mechanism in place, you would have some form of diagnostics: given the information from the dashboard, which gives me the real-time state of Snowflake, matched with the alerts, what's the total score, or the health, of my Snowflake instance?
So here’s more to the next slide. Right, this is what this entire explanation kind of boils down to. I think for the dashboard we have a very quick couple of minutes demo at the end where we will show a kind of task for building a data lake house from a healthy standpoint.
But once you have the dashboard up and working, you could have various forms of alerts that notify users based on your settings. For example, if something has gone down or something is not as expected, you could have a Slack message sent out, or some form of email that goes out, or some form of push notification or maybe an SMS, so that you can alert the user or the admin, and so on. And at the end of the day, you could come up with a scoring mechanism that gives your Snowflake system a score on a base of 100 or 10 or whatever it is, so you know what the health looks like.
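For illustration, here is a minimal sketch in Python of that alert-plus-score idea. The Slack webhook URL and the failed time-travel check are hypothetical placeholders, not part of the presenters' actual tooling:

```python
# Minimal sketch: post a Slack alert when a health check fails, and fold
# failures into a base-100 score. The webhook URL is a placeholder.
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def alert(check_name: str, detail: str) -> None:
    # Slack incoming webhooks accept a simple JSON payload with a "text" key.
    requests.post(
        SLACK_WEBHOOK_URL,
        json={"text": f":warning: Snowflake health check failed: {check_name} - {detail}"},
        timeout=10,
    )

score = 100
time_travel_ok = False  # pretend a diagnostic found time travel disabled
if not time_travel_ok:
    alert("time_travel", "DATA_RETENTION_TIME_IN_DAYS is 0")
    score -= 10  # each failed check subtracts from the base-100 score
print(f"Snowflake health score: {score}/100")
```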
Moving on to the next part: you could again split a health check into two major components. One is the process and alerting standpoint that we just spoke about; but setting that aside for now, let's focus on the technical side of things.
If you had to evaluate a system on Snowflake, you could divide the evaluation into four major buckets. The first bucket could be monitoring general account parameters in Snowflake, and the second could be monitoring users.
So that basically means managing users, user authorization, user roles, grants, login management, and so on. And then you could look at Snowflake objects. You would basically look at your warehouses: what does warehouse execution look like, how many warehouses are running, and what's the size of each warehouse?
Is any warehouse running on extra-large when extra-large isn't required? What do database storage and costs look like? Is the cost of storage growing exponentially, or spiraling beyond what's required?
You would look at your time travel details. Do you have time travel set at 90 days? Are you making the most of the time travel options in Snowflake? And then you would look at tasks. Do you have any long-running tasks that are essentially suspended, or that keep failing, and so on?
And at the end, I think we come to cost and usage, which again covers your warehouse costs and your storage costs, just to make sure your spend stays under budget on a monthly or yearly basis.
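As a concrete example of the first bucket, here is a minimal sketch that reads general account parameters with snowflake-connector-python; the connection values are placeholders, and this is not the presenters' actual code:

```python
# Minimal sketch: pull general account parameters for a health check.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",  # placeholder Snowflake account identifier
    user="admin_user",       # placeholder
    password="...",          # placeholder
    role="ACCOUNTADMIN",     # needed to see account-level settings
)
cur = conn.cursor()
cur.execute("SHOW PARAMETERS IN ACCOUNT")
params = {row[0]: row[1] for row in cur.fetchall()}  # parameter name -> value
for key in ("TIMEZONE", "DATA_RETENTION_TIME_IN_DAYS", "STATEMENT_TIMEOUT_IN_SECONDS"):
    print(key, "=", params.get(key))
conn.close()
```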
So, coming to that, as I said, we have a quick demo of the dashboard we are currently building, and this diagram gives an overview of how the dashboard works from an architecture standpoint.
So essentially, what we've done is build a Streamlit UI that talks to a FastAPI web server on the backend. We've deployed each of these as container images to Google Cloud Run, and we're executing the Streamlit UI on Cloud Run itself.
So what we didn’t see this in the demo, but in principle what the streamlit does is that it first kind of gets to get the OAuth beta token to ensure that every call to the Fast API is done with a beer token and it’s authorized once the operation is done.
It basically does a Snowflake connection test to ensure that the Snowflake parameters that were logging in with are kind of correct. And then after that it goes and fetches the account parameters that’s required for us to perform the dashboard and the sofa head check and then it does user management to get user information and then it goes and fetches the so called objects.
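A minimal sketch of that flow, with illustrative endpoint names and a placeholder Cloud Run URL rather than the project's real API, might look like this:

```python
# Minimal sketch: a Streamlit page that obtains a bearer token, verifies the
# Snowflake connection via the FastAPI backend, then fetches account parameters.
import requests
import streamlit as st

API_URL = "https://healthcheck-api.example.run.app"  # placeholder Cloud Run URL

token = st.session_state.get("token")
if token is None:
    # Hypothetical OAuth token endpoint; the real exchange depends on your provider.
    resp = requests.post(f"{API_URL}/token",
                         data={"client_id": "...", "client_secret": "..."})
    token = st.session_state["token"] = resp.json()["access_token"]

headers = {"Authorization": f"Bearer {token}"}  # every call carries the bearer token
if requests.get(f"{API_URL}/snowflake-connection-test", headers=headers).ok:
    params = requests.get(f"{API_URL}/account-parameters", headers=headers).json()
    st.metric("Time travel (days)", params.get("data_retention_days", 0))
else:
    st.error("Snowflake connection test failed")
```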
So I think I’ll quickly move on to the demo which kind of complements this diagram here. There’s a question I think on the. On the Q and A streamlit free. Do you want to take that one, Mike, or on a POV?
I can start, maybe. Yes. So Streamlit without the cloud version, when you use the Streamlit libraries in Python, is completely free, and you can use that to host on a different cloud. We've honestly not used Streamlit Cloud, so I can't comment on that.
But instead what we’ve done is we’ve used the streamlined libraries and then hosted that into our GCP cloud. So that’s completely kind of free and there is no licensing fee to pay for that. Yep, that’s not only part I would add is if folks have additional questions, want to do a deeper dive and say, well, how can I get set up?
We’re happy to. Have those conversations with you. So feel free to hang us outside of the session here and chat with you. Okay, so what you see right now is actually the streamlined UI. If you look at the link, it’s been deployed to our GCP cloud run.
And let me just refresh this. Let me also come back here. This is essentially nothing but a FastAPI server that's running in the background. And if you see, we've also deployed the FastAPI health check web server on Cloud Run.
And basically, if you look at the code here, what's happening is that you have these endpoints. These are your API endpoints that are called by the Streamlit app internally. And the advantage of FastAPI is that it's completely in Python.
So it’s quick. And unlike the other web servers, like Flask and the other ones out there, this is based on the Asynchronous method. So, for example, you see the Asynch definition. It takes the advantage of Asynchronous execution to kind of speed up things.
That’s one advantage. But the other advantage is that it also kind of lets you have dependencies. So, for example, we specify the dependency that, hey, before you call any of the functions, make sure that you run this particular method where it ensures that the Snowflake connections are tested.
And at the same time, before we run each of these Snowflake operations, we make sure that we do the internal auth check. And if any of the dependencies are not satisfied, the respective methods that we have defined in FastAPI are not called.
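Here is a minimal sketch of that dependency pattern in FastAPI. The endpoint and dependency names are illustrative, not the project's actual code, and both checks are stubbed out:

```python
# Minimal sketch: FastAPI dependencies that gate every endpoint behind an
# auth check and a Snowflake connection test.
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

async def verify_token(authorization: str = Header(default="")) -> None:
    # Stubbed bearer-token check; a real app would validate the OAuth token.
    if not authorization.startswith("Bearer "):
        raise HTTPException(status_code=401, detail="Missing bearer token")

async def snowflake_connection_ok() -> None:
    # Stubbed connection test; a real check would open a session with
    # snowflake-connector-python and run something like SELECT 1.
    connected = True
    if not connected:
        raise HTTPException(status_code=503, detail="Snowflake unreachable")

# If either dependency raises, FastAPI never executes the endpoint body.
@app.get("/account-parameters",
         dependencies=[Depends(verify_token), Depends(snowflake_connection_ok)])
async def account_parameters() -> dict:
    return {"timezone": "America/Los_Angeles", "data_retention_days": 0}
```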
So that’s a couple of good things. But first, there’s a bit of a learning curve to start, but once you get used to it, it’s is pretty quick, pretty nice, and it’s kind of easy to build on so far. Ubicon server.
So you kind of execute the Uvicon server. This is just for testing that we put on local host. But at the end we created like a docker image and then we hosted the docker image on the cloud run itself.
So you can do a local test on your localhost, and if you're okay with it, you can go and deploy it to the cloud. Now, coming back to the actual Streamlit UI: this Streamlit UI is another Docker image that's hosted on the cloud.
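Running the app locally with Uvicorn before containerizing it can be as simple as the sketch below; `app:app` assumes the FastAPI instance is named `app` in a module called app.py, which is an illustrative name:

```python
# Minimal sketch: serve the FastAPI app on localhost for testing.
import uvicorn

if __name__ == "__main__":
    uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)
```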
So see what happens when I do a quick refresh of the page. If you look at the top, every action in the Streamlit app ensures that the bearer token operation is done before the action is performed.
So once I click on the health check, it does that authentication again, then starts running the process and calculating the parameters.
This is some fancy graphics, but if you come down, you see we've divided our parameters into three major buckets. We've got some key metrics here, we've got activity, and we've got usage and cost.
And this is the progress bar, which keeps track of progress. This is still a work in progress; we are probably around 30% to 40% of where we plan to be in the coming days. So a lot of metrics are yet to be added, but you can still see the principle.
So, based on the diagram that I just explained, this UI hit the FastAPI backend and fetched the time travel days, and that's flagged as a problem for this system. Sorry, I forgot to talk about these parameters. These are essentially the Snowflake connection parameters.
So that’s the Snowflake account ID, where you got to specify the Snowflake account ID. And then you log in with the username and password. It’s important that you log into a system that has. Any user as account admin.
So that ensures that you have access to all objects. So once you log in to log in to a particular Snowflake account with the username password to a performance check, it also checks that the Snowflake connection is successful.
It makes a call to the FastAPI backend, calls the Snowflake connection endpoint, and checks whether it's gone through. After that, if it's gone through, it goes on to generate the details for us. If you look at the first metric, that's a problem. Snowflake gives you an option of 90-day time travel backup, but in this particular case it's not configured. So you've got to go turn that on, either at the account level or for each table and schema.
But for us, for this particular account, it's not yet turned on, so that's one of the key priorities we indicated. We've also shown the number of active users, and we plan to put some form of indication of what that percentage is.
Is it 80%, 70%, and so on. It also tells you that this particular system has a default time zone of America/Los_Angeles. That's a good thing to know, because a lot of the time Snowflake is connecting with systems like AWS or GCP, where data comes in as UTC, and you want to make sure your time zone matches the timestamps from the source. So it's a one-stop place to see that. If your time zone is set as Los Angeles when UTC is required, you can go ahead and change it to UTC. Before I go ahead, I think there are a couple of questions in the chat.
Yes. Let’s see here. How was the front end written? So this is using streamlit libraries and python. So this is completely in streamlit and python libraries and Python. And then she would also like to know how extremely compared with Grafana and what’s the advantage of.
Dashboard? That’s a good question. To be honest, I’m not really an exploring Grafana, so that’s a kind of difficult question for me to answer. But in case if you have Mike or Christian who want to kind of take a stab at this, they can.
Okay, so maybe I won't talk about Grafana, but I can quickly talk about the second part of the question, the advantages of Streamlit. So, for example, say you happen to be a company that's completely a Python shop, right?
So you’ve got, let’s say that you have a lot of scripts and code, which is running in Python. You could have machine learning scripts, machine learning code, which is running completely in Python. And you don’t want to kind of spend your time to kind of build a UI in JavaScript or any of the UI tools to kind of see what the output looks like.
Streamlit gives you an option to quickly use Python libraries and have a UI up and running within the Python framework itself. So one good thing is I did not have to go code anything outside Python to have this end-to-end functionality.
So picking up data from Snowflake and looking at what the output looks like, I could do completely in Python. And I think recently there's been some form of partnership between Streamlit and Snowflake, so it's become easier to integrate Streamlit as part of the Snowflake ecosystem as well. If you look at the overall direction Snowflake is going, Snowflake is trying to integrate Python through Snowpark and the other libraries. So I think it's all part of one ecosystem, where you have Snowflake as the backend, you have Python as a processing engine, and then you have Streamlit, which again is Python, as part of the same ecosystem, without stepping out.
So it cuts down development time. For example, it did not take me more than a couple of days to have this entire UI up and running, and then it took me about half a day to go and deploy it on Cloud Run.
We had a UI up in two days, pretty quick. And the other good thing is, it may not be as powerful as JavaScript or the other UI frameworks, but most of the requirements you'd have for getting a simple UI up and running are definitely satisfied by Streamlit.
For example, there was nothing I needed from the UI that is currently not available in Streamlit. But I think we may have some other people on our team who are familiar with Grafana, so maybe once we get a little bit of background on Grafana from them, we can add something to the discussion board in our group.
That way we can fully answer your question later on. Perfect. So, since I've done a refresh, I can just go ahead and run the health check again. It's running in the background, getting me all the details. I love the snow.
It’s a nice touch. It takes your focus away from the time the processing happens. So that’s one good effect. You also have balloons. You can actually do balloons if you want that. Anyways, coming back to this, so if you go back to the second tab, as you see, we’ve got other information.
So for example, we’ve picked up all those users that have account app don’t have MSA anymore. So it gives you a quick list here. It tells you that I’m right query in Snowflake. Then we’ve got all the users that are not assigned default rules.
It’s good to have good practice to have all the users assigned to because Snowflake is a role based access control system, right? So good practice package is to make sure that instead of looking at the users you look at at the role level.
So it’s good to have all users map to. Roles in Snowflake. So if you don’t have any default road map to them, may it may not be a deal, it may not be like a deal killer, but it is a warning that you need to have default.
So that’s good to have. And then you see that more than approximately 70% users have not not reset passwords within 90 days. So I would definitely recommend you to kind of have your users reset the password for 90 days and so on.
So a lot of things are still to be added here; we're adding them right now. This version is not yet deployed to the cloud, but we plan to add more parameters that make it easy to evaluate the health of the system from a user standpoint.
But if I go to usage and costs, you see warehouse execution time. I just click on it, and by looking at the warehouses, it shows that in the last 30 days these warehouses have run the most on my system. Just like warehouse execution time, we plan to have things like database storage, the storage that's been in the system for, say, the last month. And then we also plan to add something on credit usage: the credits you've used over the last month, and so on.
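A minimal sketch of that usage-and-cost query, again with placeholder connection values rather than the dashboard's real code:

```python
# Minimal sketch: 30-day credit usage per warehouse, highest first.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account", user="admin_user", password="...",  # placeholders
    role="ACCOUNTADMIN",
)
cur = conn.cursor()
cur.execute("""
    SELECT warehouse_name, SUM(credits_used) AS credits
    FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
    WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
    ORDER BY credits DESC
""")
for warehouse, credits in cur.fetchall():
    print(f"{warehouse}: {credits:.2f} credits")
conn.close()
```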
So, as I said, we are still in the early stages of this dashboard. We plan to add a few things, and there is one thing that I think is not working right now, which is the data download. What this would do is, as soon as you click on download data, it downloads a CSV file to your system with all of those tables' details in one single workbook that you can open and see.
It might not make sense to have everything on the UI, since we don't want to crowd the UI too much. But if you want the details, you just download the data, and it gives you everything you want to see about what's happening and what's going on.
Sorry to interrupt you; Jocelyn has another question. She's asking if Streamlit is like Dash in Python, and what's the advantage of Streamlit over Dash? Yes, so it is very similar to Dash. Dash has its own advantage in that it is very closely related to Plotly.
And if you go to Plotly, Plotly is connected with Dash. You basically run a server on Dash too, so you could do a lot of things on Dash as well, where you just spin up a server and then start running it.
And so you could have Dash, you could have Flask, you could have Streamlit; you could have all of these up and running, and they have their own different flavors and functionalities. I am not very sure if you can get these kinds of UI elements in Dash the way you can get them in Streamlit. I know Dash is very powerful: you can have your plots, you can have your tables, you can have all of those things you can do with Plotly on Dash as well. But I would be surprised, and I'd be happy to be corrected, if you can have all of these on Dash too.
But having said that, if Dash meets the requirement, no harm, go for it. As I said, though, there is a conscious effort from Snowflake to have Streamlit as part of its apps. If you look at Snowflake nowadays, they're talking about the Snowflake Marketplace, where they're looking at having end-to-end apps where Snowflake is the backend and Streamlit is the UI being exposed on the Marketplace. So you would expect a lot of development and a lot of support for Streamlit from the Snowflake ecosystem in the future, and that's why we've kept ourselves with Streamlit. But there is no harm in moving to Dash if that meets your requirements. Just like we run Streamlit as a server, you run Dash as a server and deploy it. It just matters whether it satisfies all your UI requirements, which Streamlit does for us. So, was there anything else?
Did I answer the question? Yes, I believe so. Thank you. This slide is just showing the symptoms a health check can help diagnose: misconfiguration, cost spikes, any kind of neglected housekeeping, errors, failures, timeouts, different issues with usage, that kind of thing.
All right, so for those of you who do not know, we are AICG, developers of the DataLakeHouse platform. And if you have any interest in DataLakeHouse.io, you can check it out at datalakehouse.io.
If you’re interested in a health check, I am also going to put it the link in the chat for you guys. And DataLakeHouse.io is a data warehousing as a service platform, we offer ELT with no code whatsoever.
There’s machine learning used. We have analytics, pre built data models and dashboard visualization. Pretty cool platform, if I do say so myself. I recommend checking it out. And here is a little bit of details on our Snowflake health checks if you’re interested in booking one.
Does anybody else have any other questions that we didn't get a chance to answer? If so, please stick them in the Q&A and we'll try to answer anything that comes up. All right. Well, if any questions do come up, you can always reach out to us on Twitter, or you can reach out in the discussion board in the Meetup group, or in the comments section once the video is up on YouTube.
Alright, our next event will be this week on Wednesday, which is a little different for us at 04:00 Eastern instead of our typical Monday night at six Eastern. And we will be going over bringing Ceridian Dayforce data into Snowflake.
Those invites were sent out through the Meetup group, so they should have come directly into your inbox. If anybody did not receive an invite that would like one, please drop your email in the chat to one of us and we will send you the invite information.
All right, that is all we have for you this evening, but we hope to see you on Wednesday to go over how you can get that Ceridian Dayforce data into your Snowflake. Thank you all so much, and we hope you have a great night.