Pinned
Automating Business Insights using GenAI
LLMs are growing more capable day by day. Better models, new versions of existing models with improved capabilities, and…
Nov 8, 2024

Pinned
Analysis of S3 API cost for Data Lake
One of the main advantages of a data lake over a traditional data warehouse is the separation of storage and compute. Due to this decoupling we…
Nov 2, 2022

Pinned
OpenMetadata — Data Catalog that works
A well-functioning data catalog or metadata store with accurate and up-to-date technical and business metadata is a dream of everybody in the…
Nov 15, 2022

Airflow Upgrades Done Right: Handling Dependencies Across Clusters
In a data engineering tech stack, the orchestration tool plays a very important role, and most of the time it is a single point of failure…
Mar 27

Combining SQL with GenAI on Athena
For some use cases it really makes sense to integrate advanced machine learning models, such as Large Language Models (LLMs), into your data…
Mar 21

Published in Dev Genius
Row level transactions on S3 Data Lake
Row-level operations have always been tricky on data lakes built on immutable object storage. To overcome this we have written ETLs to overwrite…
Nov 23, 2022

Published in Dev Genius
Optimize S3 API cost for Data Lake
In the previous article, we discussed how to identify S3 data sets causing high S3 API costs. Once we have the list of tables, the next step is…
Nov 8, 2022

Automate Jupyter Notebook in ECS -1
There are many ways to run a Jupyter notebook in an automated way on AWS, such as EMR Studio or Glue Notebooks. Most of them are designed to be…
Jun 10, 2021

Hierarchical types in Go
Services exposed by a hierarchical data model are node-specific. Using them can be made dynamic with type switches in Go, without sacrificing…
Nov 29, 2017

Semi parallel for loop in Go
Starting with the same old sequential for loop. It will iterate through a slice of integers, sum them, and print the result.
Nov 29, 2017