:::

【2020 Application Example】 AI Voice Synthesis Module, Bringing Warmth to Machine Narration

In response to current trends, digital learning and mobile educational materials have attracted widespread attention!

With rapid technological advancement, nurturing professionals who can 'adapt to developmental changes' has become a critical, ongoing concern for many businesses. In recent years, enterprises have progressively integrated 'digital learning' into employee training programs to improve learning outcomes, bringing 'digital learning' and 'mobile educational materials' into the limelight.

Outsourced narration is costly and cannot handle large volumes of demand

▸ Differences in the digital educational material production process before and after the implementation of the AI voice synthesis system

In past years, Strategic Breakthrough Corporation of Taiwan has helped companies convert many seminars, physical courses, and training events conducted by the public sector into digital materials. The conversion process, however, required inviting instructors, finding and renting filming locations, and handling post-production of the recordings and videos. During recording, problems such as speakers' nervousness, discomfort in front of the camera, or mispronunciations could lead to poor recording quality or constant retakes.

Although customer-specific narration for educational materials could be outsourced, the costs were high and outsourcing could not keep up with demand. The company therefore hoped to introduce AI speech synthesis technology and develop an 'Intelligent Voice Synthesis Module' that instantly converts the text on slides into natural, human-like voice files, saving on narration costs.

Realistic Intelligent Voice Synthesis Module, providing a diversified selection of voices

▸ AI Voice Synthesis Module Illustration

Strategic Corporation of Taiwan collaborated with the AI technology team Magic Cube Digital Ltd., using Tacotron 2, which combines features of WaveNet and Tacotron. Character embeddings are mapped to mel-scale spectrograms, and a modified WaveNet model acting as the vocoder then synthesizes time-domain waveforms from these spectrograms. The result is an intelligent voice synthesis module whose voice quality, evaluated by MOS (Mean Opinion Score), approaches that of human speech.
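The mel scale mentioned above spaces frequency bands according to perceived pitch rather than raw Hz. As a minimal sketch, the widely used formula m = 2595 · log10(1 + f / 700) can be written as follows. The function names and the example values (80 bands up to 8000 Hz) are illustrative assumptions, not settings reported for this module.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels with the common 2595*log10(1 + f/700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> list[float]:
    """Return n_bands + 2 edge frequencies (Hz) spaced evenly on the mel scale.

    Adjacent triples of edges would define the triangular filters of a mel
    filterbank. The band count and frequency range are the caller's choice.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Hypothetical settings: 80 bands between 0 Hz and 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, 80)
```

Because the edges are evenly spaced in mels, the bands are narrow at low frequencies and widen toward high frequencies, which is why a mel spectrogram gives more resolution to the range where speech carries most of its detail.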

When testers evaluated this AI Intelligent Voice Synthesis Module using the MOS voice quality standard, it received a score of 4.3, exceeding both the project's initial target of 4.21 and WaveNet's score of 4.08, which demonstrates its effectiveness.
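An MOS is the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below shows that calculation together with an approximate 95% confidence interval. The ratings are made-up sample values, not the project's test data, and the normal-approximation interval (the 1.96 factor) is an assumption about how one might report uncertainty.

```python
import math
import statistics


def mean_opinion_score(ratings: list[int]) -> tuple[float, float]:
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Each rating must lie on the 1-5 opinion scale. The interval uses a
    normal approximation, which is only a rough guide for small panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    mos = statistics.mean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for one synthesized clip.
sample_ratings = [4, 5, 4, 3, 5, 4, 4, 5]
mos, ci = mean_opinion_score(sample_ratings)
```

Reporting the interval alongside the mean helps when comparing scores that are close together, since small listener panels can produce differences of a few tenths by chance.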

AI Intelligent Voice Synthesis Module, reducing costs and increasing profits, will effectively enhance Taiwan's digital learning industry environment!

▸ Costs have been significantly reduced after the implementation of the AI voice system, and profits have increased correspondingly

This AI Intelligent Voice Synthesis Module not only reduces the cost of producing digital educational materials but also addresses the difficulties that Taiwan's industry, government, and academia face in disseminating them. It can make customers more efficient at producing digital teaching materials, significantly reduce labor shortages and cost-structure risks, and improve profitability.

Strategic Corporation of Taiwan will also continue to develop the 'Intelligent Transcription Module' and introduce Robotic Process Automation (RPA) to replace the current manual processes, such as captioning, dubbing, and file conversion in the production of digital educational materials, assisting in the transformation and enhancement of the domestic digital learning industry.

「Translated content is generated by ChatGPT and is for reference only. Translation date:2024-05-19」

Recommended Cases

CCTV Intelligent Video Search System

Consider a search for a specific person: find someone with a suitcase entering the factory in the Gao'an area. The color features of the person and the object are known: the person wears a blue and black top, and the suitcase is black. By setting object and color retrieval conditions in the CCTV intelligent video search system, operators can successfully locate three video clips containing the target subject. This greatly helps operational staff find target items, and searching with the system can be six times faster than manual effort.

Pain Points

The CSE-Kaohsiung Plant is densely equipped with CCTV to monitor every corner of the plant area, but when an incident happens, it is impossible to find it through CCTV video playback within a limited time, and the implications and risks behind this are self-evident. Many areas that are usually unmanned can easily become security blind spots. How to monitor a vast plant area more intelligently and effectively is therefore one of the crucial aspects of building a smart plant for the semiconductor industry.

The AES Plant in Kaohsiung covers a vast area, with many important sites requiring monitoring of personnel movements to protect corporate secrets and employee safety:

1. Automated production lines and warehouses. In semiconductor enterprises' automated production lines and warehouses, AGVs (Automated Guided Vehicles) often travel at high speeds. If plant personnel inadvertently enter the AGV movement area and no warning can be issued to them, any resulting accident will be too late to reverse.
2. Material and product storage areas. Materials used in semiconductor-related processes are costly. If areas storing materials or products are breached, there is a risk of losing high-value materials and products.
3. High-security areas. Trade secrets relate to the core technological competitiveness of semiconductor-related enterprises. If someone breaches the high-security areas, there is a risk of corporate secrets being leaked. The safety of trade secrets has always been one of the most critical issues for semiconductor enterprises.
4. Loading docks. At AES, the dock area often has loading vehicles coming and going. If someone intrudes into the dock area, there is a risk of vehicle collisions and accidents. Additionally, goods awaiting shipment at the dock could be stolen or damaged in collisions, causing significant reputational and financial losses for the company and disrupting production and shipping.

When an abnormal event occurs, how can the relevant key footage be found quickly in massive data? Many important locations within the AES Kaohsiung Plant need CCTV for safety checks, but with thousands to tens of thousands of cameras, manually searching footage for an event requires laborious frame-by-frame review, which is time-consuming and inefficient. Given advances in computer vision, it is beneficial to use AI to replace manual playback and searching.

Problem Scenario

Object Detection

The data source for object detection comprises two parts: the open-source dataset OIDv4 and AES Kaohsiung Plant CCTV image files. From the OIDv4 image files, data for the nine defined major object categories were extracted as training data. No usable data were found in OIDv4 for two of the categories, knives and gasoline barrels, while already-annotated training data were available in OIDv4 for the remaining seven categories. From the Kaohsiung Plant CCTV image files, some frames of the footage were selected and the objects to be detected were manually annotated to serve as training and testing data.

▸ Nine Major Objects

Color Recognition

The data source for color recognition is divided into two parts: internet image screenshots and Kaohsiung Plant CCTV image files. No publicly available open-source dataset specifically for color recognition applications has been found so far, so images are collected from the web. Images of the nine defined major object categories are searched for on the web, the objects are separated from the background so that only the object sections are kept, and the saved images are labeled by color. Additionally, for the Kaohsiung Plant CCTV image files, the already-annotated bounding boxes are used to crop the object regions from the CCTV frames, and the images whose color can be identified visually are labeled by color. Each object category has its own color definitions, depending on the colors these objects usually have in real life.

Dynamic Ignore during Training

When the object detection pilot model is trained on OIDv4, each image in the dataset is annotated for only a single category, but the image may contain unannotated objects from other categories to be detected. For such cases, a dynamic ignore technique is employed during training to avoid confusion. Next, the training data extracted from the Kaohsiung Plant are used to fine-tune the model, enhancing the detection rate for objects in the specific designated areas. Finally, the model with the lowest loss on the test set during training is selected as the main object detection model.

▸ Dynamic Ignoring

AI Helps You View CCTV

The intelligent video search system primarily serves as an assistive system for searching surveillance footage, speeding up the process of finding target events through search conditions set on objects. By simply defining the search conditions, users can quickly produce thumbnails of critical objects and play back the footage for review, shortening the time previously required for manual case retrieval. Searching becomes six times faster, allowing the front-end security unit to use this platform to strengthen first-line risk management supervision and take timely preventive measures.

「Translated content is generated by ChatGPT and is for reference only. Translation date:2024-12-12」
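The object and color retrieval conditions described in this case can be pictured as a filter over detection records. The following is a minimal sketch under stated assumptions: the `Detection` record, the `find_clips` helper, the fixed 10-second window, and the sample data are all hypothetical and do not describe the actual system's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object in one frame (hypothetical record layout)."""
    camera_id: str
    timestamp_s: float
    label: str   # object category, e.g. "person"
    color: str   # recognized color, e.g. "blue"


def find_clips(detections, conditions, window_s=10.0):
    """Return (camera_id, window_start_s) for windows meeting every condition.

    conditions is a list of (label, color) pairs. A window matches when each
    pair is satisfied by at least one detection from the same camera inside
    that window.
    """
    seen = defaultdict(set)
    for d in detections:
        key = (d.camera_id, int(d.timestamp_s // window_s))
        seen[key].add((d.label, d.color))
    wanted = set(conditions)
    return sorted(
        (camera, index * window_s)
        for (camera, index), pairs in seen.items()
        if wanted <= pairs
    )


# Made-up detections for illustration only.
detections = [
    Detection("cam1", 3.0, "person", "blue"),
    Detection("cam1", 5.0, "suitcase", "black"),
    Detection("cam1", 42.0, "person", "blue"),
    Detection("cam2", 7.0, "suitcase", "black"),
    Detection("cam2", 8.5, "person", "black"),
]
clips = find_clips(detections, [("person", "blue"), ("suitcase", "black")])
```

Indexing detections once and then matching conditions against the index is what lets a search skip frame-by-frame playback: only the windows that satisfy every condition need to be reviewed by a person.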

Realizing the dream of unmanned stores, Magpie Life is building the future of the smartphone industry

"The DNA of Magpie Life is not limited to vending machines. We believe that vending machines combine technology, access, and humanities to bring us exciting results." This sentence appears on the official website of Magpie Life. Letting vending machines bring a pleasant life and building a considerate, technological, and sustainable future for the smartphone industry is also the original intention of Magpie Life.

Founded in 2018, Magpie Life launched Taiwan's first private-brand mobile payment scan code sensor 4 months after its establishment, completing the consumption experience through screen touch. The Magpie U1 smart vending machine manages the POS system and gathers data in the background, allowing consumers to keep pace with the world's new retail and enjoy a consumption experience of convenient purchasing, secure checkout, visual entertainment, and improved logistics replenishment efficiency.

Traditional vending machines lack information visibility, and AI technology assists in information transparency

This time, the Magpie smart vending machine is also equipped with AI technology. It provides adjustable shelf space and is equipped with an industrial computer and a large touch display screen, with the aim of becoming a store-less store.

Magpie Life stated that the biggest problem with traditional vending machines is the lack of information visibility. To check inventory, replenishment personnel must physically inspect each machine, which is time-consuming and costly. When a machine breaks down, it generally stays out of operation for a long time: most failures go unreported and are not discovered until the next restocking crew arrives, and then a service technician must be scheduled, which can take weeks.

Traditional vending machines also lack real-time interactivity. When consumers encounter problems after inserting coins, manufacturers cannot handle them immediately. In addition, traditional vending machines are less flexible and cannot adapt to changes in consumer preferences. Their shortcomings include limited change for purchases, a single payment tool, a limited number of products, and few choices.

Affected by the COVID-19 epidemic, consumption habits have shifted to contactless methods, causing the unmanned store market to heat up. Vending machines can generally hold only relatively simple products such as drinks and food, so the range of goods available for sale is limited. The patented vending machine developed by Magpie has adjustable shelf space and is equipped with a cargo lift, making it suitable for various types of goods. In addition, the machine is equipped with an industrial computer and a large touch display screen, which can also meet advertising needs. It is expected to move towards a storeless store.

According to Magpie Life's observations, the consumer market trend in the past two years is that consumers demand a convenient life, food consumption patterns value a simple and fast dining experience supported by mobile phone-connected ordering, and hot drinks and fresh food delivery are the two major trends. Location, items sold, consumption methods, and multiple payment methods are the focus of market growth for smart vending machines. In terms of convenience, Taiwanese consumers still prefer to purchase vending machine food near stations, airports, schools, and businesses in business districts. Various payment methods are also gaining more support from consumers, indicating that vending machines can develop in two directions in the future: diversified items and diversified payment methods.

AI sales forecast technology integrates back-end management to achieve precise marketing purposes

Due to the wide variety of products, it is difficult to know how products perform under different factors such as season, market conditions, and promotional activities, which easily leads to out-of-stock or over-inventory situations. Magpie Life has specially developed "AI sales forecasting technology" and integrated it into the back-end management system, hoping to lock in customer purchasing preferences and intentions through data analysis in order to achieve precise marketing, make accurate business decisions, and allocate limited resources effectively.

The introduction of AI systems can achieve the three major goals of precise marketing, inventory management, and supply chain management.

This system is a replenishment decision-making aid designed specifically for supply chain managers. It uses AI to predict future sales demand, helping companies optimize production capacity, inventory, and distribution strategies. Its overall system architecture includes:

1. Data exploratory analysis function: provides automatic filling of missing values, automatic encoding, and automatic feature screening for the data.
2. Modeling function:
   (1) Provides model training for two types of prediction problems: regression and time series forecasting.
   (2) Supports AutoML automatic modeling, with the best model recommended by the system. Integrated models can also be established to improve model accuracy.
   (3) Supports multiple algorithm types: Random Forest, XGBoost, GBM, and other algorithms.
   (4) Supports a variety of time series models: exponential smoothing, ARIMA, ARIMAX, intermittent demand, dynamic multiple regression, and other models.
   (5) Supports a variety of model evaluation indicators: R, MAE, MSE, RMSE, Deviance, AUC, Lift top 1, Misclassification, and other indicators.
   (6) Supports automatic splitting into training data sets and holdout verification data sets, with a manually adjustable ratio.
   (7) Supports automatic model ensemble learning (Stacked Ensemble), class balancing (Balancing Classes), and Early Stopping.
   (8) Supports the creation of multiple models at the same time. The system allocates resources according to modeling needs, so that modeling, prediction, and other tasks have independent computing resources and do not affect each other. Within the upper limit of the overall server space, computing resources can be used efficiently.
   (9) Has an in-memory computing function, which uses large-capacity, high-speed memory for calculations to avoid reading and writing large numbers of files on the hard disk and to improve computing performance.
3. Data concatenation function: using API integration and fully automated data concatenation, there is no need to manually import data, which improves the user experience.
4. Chart analysis function: provides visual charts and basic statistical values for product sales.

AI data analysis solutions have two major advantages:

1. Entrepreneurship: machines can be rented or purchased at low cost to open unmanned physical stores and cooperate with the chain retail industry. Through smart machines, entrepreneurs can run a retail business at a cost lower than store rent. Two cooperation models, machine sales and leasing, are provided, and the choice is based on an evaluation of the industry.
2. Various types of products on the shelves: products are sold anytime and anywhere, 24 hours a day. Up to 60 kinds of diversified products can be put on the shelves. Large transparent windows enhance the visibility of products. Regular replenishment and tracking of product sales status are available, and product types can be adjusted according to needs.

Recently, the line between the Internet and the physical world has blurred, the way customers interact has changed significantly, and consumer demand is changing and becoming personalized. The retail industry is facing unprecedented challenges and uncertainties, and mastering data has become key. AI data analysis solutions can help the retail industry quickly activate large amounts of data, create seamless personalized experiences, optimize the operational value chain and improve efficiency, and strengthen the core competitiveness of enterprises.

「Translated content is generated by ChatGPT and is for reference only. Translation date:2024-05-19」
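Among the time series models listed for the forecasting system, exponential smoothing is the simplest to illustrate, and holdout validation with MAE and RMSE shows how such a forecast might be scored. This is a minimal sketch only: the function names, the smoothing factor, and the sales figures are illustrative assumptions and are not taken from Magpie Life's system.

```python
import math


def ses_forecast(history: list[float], alpha: float) -> float:
    """One-step-ahead forecast from simple exponential smoothing.

    alpha in (0, 1] controls how strongly recent observations are weighted.
    """
    if not history:
        raise ValueError("history must not be empty")
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


def holdout_errors(series: list[float], holdout: int, alpha: float) -> tuple[float, float]:
    """Forecast each of the last `holdout` points from the points before it.

    Returns (MAE, RMSE) over those walk-forward forecasts.
    """
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    errors = [
        series[t] - ses_forecast(series[:t], alpha)
        for t in range(len(series) - holdout, len(series))
    ]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Made-up weekly unit sales for one product slot.
weekly_sales = [32.0, 35.0, 31.0, 40.0, 38.0, 36.0, 42.0, 39.0]
mae, rmse = holdout_errors(weekly_sales, holdout=3, alpha=0.5)
```

RMSE penalizes large misses more heavily than MAE, so comparing the two gives a rough sense of whether forecast errors are evenly spread or dominated by a few weeks, which matters when the cost of a stock-out is high.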

【Implementation Case】 Transforming into a Large-Scale AIoT Technology Playground: The Spectacular Makeover of the National Museum of Marine Science & Technology

Taiwan is a maritime nation. When you visit the Badozi Fishing Port or Tidal Park in Keelung, do you also explore the mysteries of the ocean world at the 48-hectare National Museum of Marine Science & Technology? To bring more people closer to marine technology, Keelung's Marine Museum has introduced technological services, transforming the venue into a large technology playground that delights both children and adults and fully embraces the 'learning through play' approach.

After a lengthy planning process, Northern Taiwan's largest marine science museum opened in Keelung in January 2014. The museum focuses on marine education and technology and boasts Taiwan's largest IMAX 3D ocean theater. Its unique themes and modern viewing facilities should make it a well-known landmark in Keelung. However, the original exhibition planning was static and highly specialized, lacking sufficient interaction with the public. Visitors also reported that the exhibits were limited and quite boring, leading to poor overall experience ratings.

The top three dissatisfactions with the museum were weak connections to surrounding attractions, unengaging display content, and lack of exhibit material

According to statistics from the Marine Museum, the ratio of local to visiting guests is approximately 6:4, with most out-of-town visitors coming from the north. Transportation is primarily by car and bus; common types of visits include family, parent-child, and friends; and the stay duration is generally 1 to 2 hours. Upon deeper investigation, the top three visitor complaints were weak linkages to surrounding attractions, unengaging display content, and an insufficient number of exhibits. The museum analyzed potential reasons, including some displays being too specialized for the public to understand and a lack of interactive elements, which made the exhibition boring and visits hurried and brief.

Analysis of visitor profiles revealed that, since half of the museum's visitors are locals and reaching the museum is not easy for out-of-towners who must travel by car or public transport, the design of the venue and exhibitions must incorporate more interactivity and intrigue to encourage locals to return and to extend visitors' stays, while using technological services to highlight the museum's unique features.

Through a recommendation from the Information Software Association, part of the Ministry of Economic Affairs' Industrial Bureau AI team, the Marine Museum commissioned Jugu Technology to resolve the issue of uninspiring venue attractions. Preliminary interviews by Jugu Technology revealed that many visitors were attracted by the museum's architectural design, notices posted on nearby walls, flags, or events being held. The most interesting feature for visitors was the 3D ocean theater, indicating that content presented through audio-video and physical scenery was more engaging.

Seven major AI technologies lead to a boost in regional tourism at the Marine Museum

Through the introduction of technology services, Jugu Technology designed seven major services for the 48-hectare site: AI voice tours, treasure hunt puzzle games, AI exhibit interactive revitalization, AI space exhibition interactive experience, AI crowd control, Face AI interactive experience, and an AI voice customer service system. By utilizing AIoT and cloud technology, they made the exhibition more interesting, not only solving the problem of static viewing that bores children but also doubling learning efficiency and dramatically improving public perception of the Marine Museum, thus increasing the intention to visit and boosting regional tourism.

▸ The National Museum of Marine Science and Technology introduced seven major technological application services including AI voice guide

Jugu Technology aimed to improve the space optimization of the Marine Museum, using the special exhibition of coastal birds in northern Taiwan as a prototype and integrating 'face', 'limb', and 'crowd' as three main axes to enhance functionality and help improve the museum's application of AI. In practice, the Marine Museum and Jugu Technology avoided installing water, electricity, or pipeline works in active exhibits so as to maintain the quality of the viewing experience. Instead, they selected exhibits that were not yet open and introduced a series of technological services tailored to the unique characteristics of those exhibits.

In the coastal bird special exhibition inside the Marine Museum, initial construction discussions with the curators settled on Bella X1 for a welcoming interactive introduction at the exhibition entrance. This was followed by an AI-powered smart guide in both Chinese and English using X1 for narration, coupled with a fun treasure hunting stamp-collecting activity (APP X1) that lets visitors take part in challenges. Subsequently, bird species within the bird exhibition were brought to life interactively using X1, and AR scenarios X1 were introduced into the exhibition space to add elements of fun and entertainment. Finally, Face AI was used to interactively test facial expressions and score smiles.

▸ The gorgeously transformed Marine Museum will become the best travel destination for families with children (Image: Marine Museum FB Page)

The AIoT services introduced by the Marine Museum could be extended in the future to various exhibition-type museums and even static art galleries, tailored to the unique characteristics of different venues. They could also be promoted through government projects and related plans, aiding rural revitalization, making rural visits more than just sightseeing, and breaking free from stereotypes associated with different venues. The applications of these services are broad.

「Translated content is generated by ChatGPT and is for reference only. Translation date:2024-05-19」