GLIMMER

GLIMMER

In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long protein coding genes". GLIMMER was the first system that used the interpolated Markov model to identify coding regions. The GLIMMER software is open source and is maintained by Steven Salzberg, Art Delcher, and their colleagues at the Center for Computational Biology at Johns Hopkins University. The original GLIMMER algorithms and software were designed by Art Delcher, Simon Kasif and Steven Salzberg and applied to bacterial genome annotation in collaboration with Owen White. == Versions == === GLIMMER 1.0 === First Version of GLIMMER "i.e., GLIMMER 1.0" was released in 1998 and it was published in the paper Microbial gene identification using interpolated Markov model. Markov models were used to identify microbial genes in GLIMMER 1.0. GLIMMER considers the local composition sequence dependencies which makes GLIMMER more flexible and more powerful when compared to fixed-order Markov model. There was a comparison made between interpolated Markov model used by GLIMMER and fifth order Markov model in the paper Microbial gene identification using interpolated Markov models. "GLIMMER algorithm found 1680 genes out of 1717 annotated genes in Haemophilus influenzae where fifth order Markov model found 1574 genes. GLIMMER found 209 additional genes which were not included in 1717 annotated genes where fifth order Markov model found 104 genes."' === GLIMMER 2.0 === Second Version of GLIMMER i.e., GLIMMER 2.0 was released in 1999 and it was published in the paper Improved microbial identification with GLIMMER. This paper provides significant technical improvements such as using interpolated context model instead of interpolated Markov model and resolving overlapping genes which improves the accuracy of GLIMMER. Interpolated context models are used instead of interpolated Markov model which gives the flexibility to select any base. In interpolated Markov model probability distribution of a base is determined from the immediate preceding bases. If the immediate preceding base is irrelevant amino acid translation, interpolated Markov model still considers the preceding base to determine the probability of given base where as interpolated context model which was used in GLIMMER 2.0 can ignore irrelevant bases. False positive predictions were increased in GLIMMER 2.0 to reduce the number of false negative predictions. Overlapped genes are also resolved in GLIMMER 2.0. Various comparisons between GLIMMER 1.0 and GLIMMER 2.0 were made in the paper Improved microbial identification with GLIMMER which shows improvement in the later version. "Sensitivity of GLIMMER 1.0 ranges from 98.4 to 99.7% with an average of 99.1% where as GLIMMER 2.0 has a sensitivity range from 98.6 to 99.8% with an average of 99.3%. GLIMMER 2.0 is very effective in finding genes of high density. The parasite Trypanosoma brucei, responsible for causing African sleeping sickness is being identified by GLIMMER 2.0" === GLIMMER 3.0 === Third version of GLIMMER, "GLIMMER 3.0" was released in 2007 and it was published in the paper Identifying bacterial genes and endosymbiont DNA with Glimmer. This paper describes several major changes made to the GLIMMER system including improved methods to identify coding regions and start codon. Scoring of ORF in GLIMMER 3.0 is done in reverse order i.e., starting from stop codon and moves back towards the start codon. Reverse scanning helps in identifying the coding portion of the gene more accurately which is contained in the context window of IMM. GLIMMER 3.0 also improves the generated training set data by comparing the long-ORF with universal amino acid distribution of widely disparate bacterial genomes."GLIMMER 3.0 has an average long-ORF output of 57% for various organisms where as GLIMMER 2.0 has an average long-ORF output of 39%." GLIMMER 3.0 reduces the rate of false positive predictions which were increased in GLIMMER 2.0 to reduce the number of false negative predictions. "GLIMMER 3.0 has a start-site prediction accuracy of 99.5% for 3'5' matches where as GLIMMER 2.0 has 99.1% for 3'5' matches. GLIMMER 3.0 uses a new algorithm for scanning coding regions, a new start site detection module, and architecture which integrates all gene predictions across an entire genome." Minimum description length === Theoretical and Biological Foundation === The GLIMMER project helped introduce and popularize the use of variable length models in Computational Biology and Bioinformatics that subsequently have been applied to numerous problems such as protein classification and others. Variable length modeling was originally pioneered by information theorists and subsequently ingeniously applied and popularized in data compression (e.g. Ziv-Lempel compression). Prediction and compression are intimately linked using Minimum Description Length Principles. The basic idea is to create a dictionary of frequent words (motifs in biological sequences). The intuition is that the frequently occurring motifs are likely to be most predictive and informative. In GLIMMER the interpolated model is a mixture model of the probabilities of these relatively common motifs. Similarly to the development of HMMs in Computational Biology, the authors of GLIMMER were conceptually influenced by the previous application of another variant of interpolated Markov models to speech recognition by researchers such as Fred Jelinek (IBM) and Eric Ristad (Princeton). The learning algorithm in GLIMMER is different from these earlier approaches. == Access == GLIMMER can be downloaded from The Glimmer home page (requires a C++ compiler). Alternatively, an online version is hosted by NCBI [1]. == How it works == GLIMMER primarily searches for long-ORFS. An open reading frame might overlap with any other open reading frame which will be resolved using the technique described in the sub section. Using these long-ORFS and following certain amino acid distribution GLIMMER generates training set data. Using these training data, GLIMMER trains all the six Markov models of coding DNA from zero to eight order and also train the model for noncoding DNA GLIMMER tries to calculate the probabilities from the data. Based on the number of observations, GLIMMER determines whether to use fixed order Markov model or interpolated Markov model. If the number of observations are greater than 400, GLIMMER uses fixed order Markov model to obtain there probabilities. If the number of observations are less than 400, GLIMMER uses interpolated Markov model which is briefly explained in the next sub section. GLIMMER obtains score for every long-ORF generated using all the six coding DNA models and also using non-coding DNA model. If the score obtained in the previous step is greater than a certain threshold then GLIMMER predicts it to be a gene. The steps explained above describes the basic functionality of GLIMMER. There are various improvements made to GLIMMER and some of them are described in the following sub-sections. === The GLIMMER system === GLIMMER system consists of two programs. First program called build-imm, which takes an input set of sequences and outputs the interpolated Markov model as follows. The probability for each base i.e., A,C,G,T for all k-mers for 0 ≤ k ≤ 8 is computed. Then, for each k-mer, GLIMMER computes weight. New sequence probability is computed as follows. where n is the length of the sequence S x {\displaystyle S_{x}} is the oligomer at position x. I M M 8 ( S x ) {\displaystyle IMM_{8}(S_{x})} , the 8 t h {\displaystyle 8^{th}} -order interpolated Markov model score is computed as "where Y k ( S x − 1 ) {\displaystyle Y_{k}(S_{x-1})} is the weight of the k-mer at position x-1 in the sequence S and P k ( S x ) {\displaystyle P_{k}(S_{x})} is the estimate obtained from the training data of the probability of the base located at position x in the k t h {\displaystyle k^{th}} -order model." The probability of base S x {\displaystyle S_{x}} given the i previous bases is computed as follows. "The value of Y i ( S x ) {\displaystyle Y_{i}(S_{x})} associated with P i ( S x ) {\displaystyle P_{i}(S_{x})} can be regarded as a measure of confidence in the accuracy of this value as an estimate of the true probability. GLIMMER uses two criteria to determine Y i ( S x ) {\displaystyle Y_{i}(S_{x})} . The first of these is simple frequency occurrence in which the number of occurrences of context string S x , i {\displaystyle S_{x,i}} in the training data exceeds a specific threshold value, then Y i ( S x ) {\displaystyle Y_{i}(S_{x})} is set to 1.0. The current default value for threshold is 400, which gives 95% confidence. When there are insufficient sample occurrences of a context string, build-imm employ additional criteria to determine Y {\displaystyle Y} value. For a

Thinkfree Office

Thinkfree Office is a web-based commercial office productivity suite developed by South Korea-based Thinkfree Inc. It includes Word (a word processor), Spreadsheet (a spreadsheet) and Presentation (a presentation program). They are compatible with Microsoft Office's Word, PowerPoint, and Excel. It also features collaborative editing. The product is hosted on the client's server. == Supported file formats == Thinkfree Office supports ISO/IEC international standard ISO/IEC 26300 Open Document Format for Office Applications (odf, odt, odp, ods, odg). It also supports Microsoft's XML formats (docx, pptx, xlsx) and Microsoft's legacy binary formats (doc, ppt, xls). == Naming == The software was previously marketed under different names, such as Thinkfree Server, Thinkfree Online, Hancom Office Online, and Hancom Office Web. Eventually, the brand was consolidated under the name Thinkfree Office. == History == In June 2000, Thinkfree Inc. released Thinkfree Office, based in Silicon Valley, California. It is recognized as the world's first online office editor (predating Google Docs and Microsoft 365) and attracted significant media coverage, including reports on CNN. In 2001, Microsoft CEO Steve Ballmer highlighted Thinkfree as a significant competitor in a magazine interview, considering it a potential threat to his company, second only to Linux. In November 2003, Hancom, a South Korean office software company, signed a memorandum of understanding and subsequently acquired Thinkfree. In January 2004, Thinkfree expanded into other foreign markets. Subsidiary Haansoft USA, Inc. was created in San Jose, California to begin formal commercial operations in the US market. At the same time, a partnership was established with Riverdeep with the purpose of improving marketshare. In February 2004, expansion into the Japanese market began. A commercial agency agreement was signed with PSI in Shinjuku, Japan, which allowed for localized distribution. In addition, a global agreement was entered into with Yamada Denki, one of the three main computer distributors in Japan, for a total of 180,000 units. In May 2006, Thinkfree Office received the "Product of the Year" award at the Well-Connected Awards, USA. In January 2009, Thinkfree Mobile was launched at CES 2009 in Las Vegas. In April 2009, Thinkfree Live, Korea's first web office service, was launched. In June 2018, a partnership was formed with Amazon Web Services to integrate Thinkfree Office into WorkDocs, an in-house office suite. In October 2023, Hancom split its online office business unit as "Thinkfree Inc.".

GuideGeek

GuideGeek is an AI-powered travel assistant that was launched by travel publisher Matador Network in April 2023 and is accessed by users through Instagram, WhatsApp and Facebook Messenger to plan itineraries or provide travel tips and recommendations. It uses generative artificial intelligence technology from OpenAI. Matador Network is a San Francisco-based digital media company and online travel publication with millions of monthly visitors and social media followers. == Features == Users message GuideGeek questions about travel and receive customized answers and itineraries that are pulled from ChatGPT in addition to over 1,000 additional travel-specific integrations such as live flight, hotel and vacation rental data. Travelers can specify their budget and needs to generate custom itineraries. GuideGeek is not an app and does not require the user to download anything, instead relying on messaging apps such as Instagram to connect users with the AI. GuideGeek is free to use, doesn't include ads, and doesn't sell user data. Matador Network has a team of staff members monitoring conversations to correct them if the AI makes a false statement; for example, one user incorrectly inputted “Crete Freeze” instead of “Crete, Greece”, and the AI made up a fictional soft serve company. Using a technique known as reinforcement learning from human feedback (RLHF), the accuracy of GuideGeek increased to 98%, according to Matador Network CEO, Ross Borden. == Destination partnerships == Matador Network is monetizing GuideGeek via white-label partnerships with tourism bureaus and destination marketing organizations (DMOs). As of March 2024, it had over a dozen such clients. Estes Park, Colorado, was one of the first DMOs to partner with Matador for a custom version of GuideGeek called “Rocky Mountain Roamer.” For Discover Greece, Matador created Pythia, a custom AI named after the high priestess of the Temple of Apollo at Delphi. As Borden explained to Travel + Leisure, “Visitors to the Discover Greece website will find Pythia in the bottom right corner, and they can converse with the AI like a friend who knows everything about Greece.” Other DMOs who have partnerships with GuideGeek include the Aruba Tourism Authority, Visit Reno Tahoe, Illinois Office of Tourism, and Tourism Richmond. == Awards == In recognition of GuideGeek, Fast Company named Matador Network to its 2024 list of Most Innovative Companies. Following growth driven by the launch of GuideGeek, Matador Network was ranked on the 2024 Inc. 5000 list of fastest-growing private companies in America. The 2024 Skift IDEA Awards recognized Matador Network as a finalist in the category of Best Use of AI for GuideGeek's customized AI for the travel industry. == Michael Motamedi experiment == Travel influencer and chef Michael Motamedi traveled the world with his wife Vanessa Salas and their 2-year-old daughter on a six-month trip (which was later extended to a full year) led by GuideGeek. The family started off in Morocco before heading to Spain and continuing east. The experiment became the basis of a web series called “No Fixed Address.” Motamedi used GuideGeek's AI to select countries the family visited, where they ate, and what sites they saw. Motamedi and Salas first tested out the technology in April 2023 while using the chatbot to plan a date night in Mexico City. GuideGeek provided speakeasy and drink recommendations as well as local history facts.

DreamBooth

DreamBooth is a deep learning generation model used to personalize existing text-to-image models by fine-tuning. It was developed by researchers from Google Research and Boston University in 2022. Originally developed using Google's own Imagen text-to-image model, DreamBooth implementations can be applied to other text-to-image models, where it can allow the model to generate more fine-tuned and personalized outputs after training on three to five images of a subject. == Technology == Pretrained text-to-image diffusion models, while often capable of offering a diverse range of different image output types, lack the specificity required to generate images of lesser-known subjects, and are limited in their ability to render known subjects in different situations and contexts. The methodology used to run implementations of DreamBooth involves the fine-tuning the full UNet component of the diffusion model using a few images (usually 3--5) depicting a specific subject. Images are paired with text prompts that contain the name of the class the subject belongs to, plus a unique identifier. As an example, a photograph of a [Nissan R34 GTR] car, with car being the class); a class-specific prior preservation loss is applied to encourage the model to generate diverse instances of the subject based on what the model is already trained on for the original class. Pairs of low-resolution and high-resolution images taken from the set of input images are used to fine-tune the super-resolution components, allowing the minute details of the subject to be maintained. == Usage == DreamBooth can be used to fine-tune models such as Stable Diffusion, where it may alleviate a common shortcoming of Stable Diffusion not being able to adequately generate images of specific individual people. Such a use case is quite VRAM intensive, however, and thus cost-prohibitive for hobbyist users. The Stable Diffusion adaptation of DreamBooth in particular is released as a free and open-source project based on the technology outlined by the original paper published by Ruiz et. al. in 2022. Concerns have been raised regarding the ability for bad actors to utilise DreamBooth to generate misleading images for malicious purposes, and that its open-source nature allows anyone to utilise or even make improvements to the technology. In addition, artists have expressed their apprehension regarding the ethics of using DreamBooth to train model checkpoints that are specifically aimed at imitating specific art styles associated with human artists; one such critic is Hollie Mengert, an illustrator for Disney and Penguin Random House who has had her art style trained into a checkpoint model via DreamBooth and shared online, without her consent.

SAS Viya

SAS Viya is an artificial intelligence, analytics and data management platform developed by SAS Institute. == History == SAS Viya was released in 2016. The software was containerized with the release of Viya 4 in 2020. Viya has become one of SAS' most widely used platforms during the AI boom, as artificial intelligence becomes more widely used in business and computing. == Technical overview == The platform is cloud-native, and is executed on SAS's Cloud Analytics Services (CAS) engine. It is compatible with open source software, allowing users to build models using open sources tool such as R, Python and Jupyter. It integrates with major large language models like GPT-4 and Gemini Pro. The platform uses econometrics to create predictive models for forecasting scenarios based on complex data. It also has features for detecting algorithmic bias, auditing decisions and monitoring models. It is implemented through a low-code, no-code platform. The software is available on Amazon AWS Marketplace, Google Cloud, Red Hat OpenShift, and on Microsoft Azure Marketplace under a pay-as-you-use model. == Software == SAS Viya has released software as a service (SaaS) modules for creating AI content. These include Viya Workbench, Viya App Factory, Viya Copilot, and SAS Data Maker. The company also develops industry specific models, used by companies including Georgia-Pacific. == Applications == === Banking === The software is also widely used in business, especially in areas such as predictive modelling and fraud detection. === Insurance === SAS Viya is used in insurance for tasks such as actuarial analytics and modelling, as well as regulatory reporting. === Healthcare and life sciences === In 2023, the company introduced SAS Health, a common health data model built on the SAS Viya platform. AstraZeneca has partnered with SAS to use SAS Viya and SAS Life Science Analytics Framework in its delivery and approval processes. In 2024, SAS partnered with the University of Cambridge's Maxwell Center to use SAS Viya for healthcare research and development. === Public sector === SAS Viya is used in partnership with national and local governments to provide services and detect tax fraud. === Education === SAS Viya is used in research and education, particularly studies related to business intelligence, cybersecurity and data management. SAS Institute has partnered with educational institutions such as Appalachian State University, Clemson University, University of Arkansas, Stockholm University, and Marian University, to provide access to and training for using SAS Viya.

Stairstep interpolation

In the field of image processing, stairstep interpolation is a widely employed method technique for interpolating pixels after enlarging an image. The fundamental concept is to interpolate multiple times, in small increments, using any interpolation algorithm that is better than nearest-neighbor interpolation such as; bilinear interpolation, and bicubic interpolation. A common scenario is to interpolate an image by using a bicubic interpolation which increases the image size by no more than 10% (110% of the original size) at a time until the desired size is reached. Fred Miranda, a developer, popularized this method by creating and developing several Photoshop plug-ins that incorporate this technique. == Example ==

Lobsang Monlam

Geshe Lobsang Monlam (Tibetan: དགེ་བཤེས་བློ་བཟང་སྨོན་ལམ, Wylie: dge bshes blo bzang smon lam), born in 1976 in Ngawa eastern Tibet, is a Tibetan Buddhist scholar and programmer who uses digital technologies to preserve the Tibetan language and culture. He is best known for developing Tibetan typefaces and for the multi-volume Great Monlam Tibetan Dictionary. In 2025, he received the Snow Lion Award for Human Rights from the International Campaign for Tibet. He is also working on developing a "Dalai Lama AI," a specialized language model. == Biography == Lobsang Monlam was born in 1976 in Ngawa, eastern Tibet, anciently Tibetan Amdo, where he became a monk at the age of 12.. At the age of 17, in 1993, Lobsang Monlam fled Tibet by crossing the Himalayas to reach southern India and discovered computer science in a monastery. In 1993, he was ordained monk in the Sera Mey College in Bylakuppe, Karnataka, India, where he obtained a Geshe title in 2013.. By the early 2000s, Lobsang Monlam had already learned to paint thangkas and to compose plans and drawings. He used this knowledge to design a new assembly hall for Sera Mey, which the monks needed. Thanks to his work, Lobsang Monlam received donations from patrons of the monastery, which he was able to use to buy his first computer. He bought his first laptop in 2002 and largely taught himself how to use the hardware and software with the help of manuals. As a Buddhist scholar, he combines meditation practice with his digital work. In 2012, he founded and directs the Monlam Tibetan Information Technology Research Center in Dharamsala, which specializes in Tibetan language and software projects. Since then, he is its director, researching Tibetan language-related software. In 2019, advised by the 14th Dalai Lama, he founded Monlam IT and Research (OPC) Private Limited. Since the 2000s, Monlam has been developing Tibetan typefaces; the first Monlam Tibetan font was created in 2005. Under his direction, the Monlam Great Tibetan Dictionary was created, comprising 223 printed volumes and over 300,000 entries; approximately 150 people worked on this project for over nine years. On May 27, 2022, the Dalai Lama inaugurated the Monlam Tibetan Dictionary, produced by the Monlam Tibetan Information Technology Research Center, at Namgyal Monastery in McLeod Ganj. According to Penpa Tsering, this is the world's largest dictionary, created with guidance from the Dalai Lama, based on proposals from Lobsang Monlam and his team under the direction of Samdhong Rinpoche, and other lamas from all schools of Tibetan Buddhism and Yungdrung Bön. On December 5, 2024, Lobsang Monlam testified at a hearing of the US Congressional-Executive Commission on China in Washington, chaired by Christopher Smith, on the difficulties of preserving the Tibetan language and culture in Tibet and the Tibetan diaspora, and on the interest of the Monlam Tibetan Informatics Research Center in developing technologies for the preservation of the Tibetan language. On December 12, 2024, the work was presented to the Library of Congress in Washington, D.C., and launched at an event. The free Monlam Great Tibetan Dictionary app is available in several languages; the German version was created in collaboration with the Tibet Institute Rikon and has been downloaded millions of times. In total, Monlam has created over 37 apps related to the Tibetan language and translation; In 2023, its center launched the Monlam artificial intelligence platform, equipped with modules for machine translation, optical character recognition, speech transcription and speech synthesis.. For their efforts, he and Sophie Richardson received the Snow Lion Award in 2025, which was presented by Richard Gere and came with a prize of €3,000. In 2019, he started a PhD at Bangalore University on Library Science. He obtained his doctorate on November 30, 2023. Currently, he spearheads Monlam AI. Lobsang Monlam is developing "Dalai Lama AI" to digitally preserve the teachings of the 14th Dalai Lama, now 90 years old, for future generations. Lobsang Monlam states, "If we succeed in preserving the Dalai Lama, we also preserve the movement."